THE NATURE, VARIETY, AND SOCIAL PATTERNING OF MORAL RESPONSES TO TRANSGRESSION

JUSTIN ARONFREED¹

University of Pennsylvania


The most characteristic feature of
contemporary treatments of moral
development is undoubtedly the emphasis
on the role of moral judgment. Piaget’s
(1948) original observations on the child’s 
growing awareness of morality have been 
followed by numerous assessments (Boehm, 
1957; Durkin, 1959; Kohlberg, 1958; MacRae, 
1954; Peel, 1959) of children’s perceptions of 
social norms, of their conditions of application, 
and of their sources of origin. Of course, the 
knowledge of standards of conduct, as it is 
conveyed through verbal report, is a step 
removed from actual social behavior. But 
psychological conceptions of morality that 
have been more directly concerned with its 
behavioral manifestations have also commonly 
reserved the term moral for behavior that 
appears to rest upon standards with respect to 
which actions, thoughts, or feelings may be 
evaluated as “good” and “bad” or “right” and
“wrong.” The concept of guilt, so uniformly 
accorded a central position in theoretical 
accounts of internalized moral responses to 
transgression, reflects the assumption that 
judgment is the base of morality. The clinical 
impetus to psychologists’ concern with guilt 
has sometimes led to a focus on its affective or 
motivational properties. This focus should not, 
however, be permitted to obscure the fact that 
guilt ordinarily designates a psychological 
phenomenon having a cognitive aspect of 
self-evaluation. 

One finds in the work of many psychologists 
the view that guilt rests upon self-criticism 
and is a prerequisite of moral reactions to 
transgression which are internalized in the 
sense that they occur even in the absence of 
any explicit or threatened external punishment.
Freud (1936), for example, clearly
distinguished the self-mediated consequences
of transgression from those dependent upon
external sanctions. He portrayed the individual’s
conscience as an autonomous
internalized representation of prohibitions and
punishments formerly present in his parents’
behavior. The same kind of distinction was
made by Sears, Maccoby, and Levin (1957),
who used children’s expressions of guilt as
evidence that they were following the
self-instruction of their own standards and not
merely anticipating the external consequences
of their misdeeds. In a cross-cultural study of a
number of different societies, Whiting and
Child (1953) similarly described two distinct
mechanisms of social control, one of “moral
anxiety” or guilt and the second of “objective
anxiety” or fear of external punishment.

¹ The author is indebted to the Eastern Pennsylvania
Psychiatric Institute, Philadelphia, Pennsylvania, for
its support of the initial phase of this work, and to
Helene Aaronson, Nina Chaiken, Joseph Denny, and
Kirby Smith for their assistance in the collection of
data.

Other investigators have concurred in 
equating internalized moral reactions with the 
presence of self-criticism. Henry and Short 
(1954), in their study of societal influences 
upon the direction of aggression, speculated 
that self-punitive tendencies were propor- 
tionate to the extent of internalization of 
parental sanctions. Heinecke (1953) took the 
intensity of expression of guilt, in a group of 
4- and 5-year-olds, as a measure of “strength 
of superego formation.” And Allinsmith 
(1960) has attempted to relate a variety of the 
moral reactions of older children to an exten- 
sive network of self-evaluative standards. All 
of these approaches share the view that a 
response to transgression is moral when it 
follows from the child’s judgment of his own 
behavior and no longer requires the support 
of external norms and sanctions. 

Internalized responses to transgression 
comprise, of course, only one set of phenomena 
to be understood in any general account of 
moral behavior. But these responses provide a 
fundamental insight into the nature of morality 
in that they assume a variety of forms many 
of which reveal an absence of self-evaluation 
and an external orientation inconsistent with 
what one might expect if they were entirely 
internally mediated. Experiments by Levin and
Turgeon (1957), and by Siegel and Kohn
(1959), have demonstrated, in the area of
aggressive behavior, some of the conditions
under which children may relinquish internal
controls and rely on an external definition of
transgressions and their consequences.
Common observation of children indicates
quite clearly that even when their actions are
not directly known to others, they often
perceive or anticipate punishment or criticism
in the responses of other people or in impersonal
fortuitous events. Children often
appear uncomfortable until they have confessed
their transgressions or have done something
that brings about discovery or punishment.
Nor is it unusual to see young children
reproduce their parents’ disciplinary reactions
in response to their own actions. Even their
self-critical or self-corrective behavior may
depend on external cues and resources for its
initiation and performance. Another more
indirect suggestion of children’s dependence
on the behavioral norms of their social environment
is apparent in their tendency to localize
the responsibility for their own misbehavior in
external sources and to blame others for the
same offenses.

Externally oriented moral responses cannot 
be regarded as merely transitory reactions 
characteristic of childhood. Various studies 
(Adorno, Frenkel-Brunswik, Levinson, & 
Sanford, 1950; Allinsmith, 1960; Aronfreed, 
1960) have shown that, even beyond childhood, 
people frequently perceive the source of
criticism or punishment for their actions in
other people’s responses, interpret impersonal
events as punishment, or blame others rather
than themselves for wrongdoing. In a recent
statement based on cross-cultural comparisons,
Whiting (1959) points out that while no
society can rely completely on direct external
supervision of social behavior, not all societies
inculcate guilt as the effective mechanism of
moral functioning. He suggests that externally
defined modes of social control, such as the
fear of ghosts or sorcerers, are more typical of
some cultural groups. In our own western
society, while it may be thought of as one
that values the more highly internalized kinds
of controls, one also sees that moral responses
are often initiated and carried out in a way
that testifies to the relevance of their external
social context.

It is difficult to conceive that self-evaluation
is an essential ingredient of those responses to
transgression which show an external orientation.
Indeed, even the verbalization of self-critical
statements can sometimes be shown,
upon questioning an individual, to follow not
from an articulate standard of judgment, but
rather from the tendency to make the critical
response automatically as though it were a
necessary consequence of having committed a
transgression. Such a tendency can be
especially transparent in the behavior of young
children. We know, too, that a person may
feel shame, disgust, or generalized anxiety,
without a clear awareness of precisely what is
wrong about his behavior. And it frequently
seems that people do not need to engage in a
great deal of judgment or conscious reflection 
in order to experience unpleasant feelings, and 
even to carry out self-corrective behavior, in 
response to some of their actions. The minimal 
role that explicit standards appear to play in 
some moral reactions may very possibly reflect 
the limited cognitive auspices under which 
much of socialization takes place. 

For various reasons, naturalistic observation 
seems more appropriate than experimental 
methods as an initial approach to exploring 
and circumscribing a wide range of moral
responses. However, in order to examine the
role of self-evaluation in such responses and,
more generally, the extent of their internal or
external orientation, a systematic assessment
is required that will distinguish specific forms
and patterns of response. The empirical survey
of children’s responses to transgression reported
here is intended to provide a description
of the variety and distribution of these responses
in a sample of children chosen to
represent both sexes and the two major
socioeconomic classes. The survey will then
be used to evaluate a theoretical framework
specifying the relationships between various
kinds of moral response and the child’s positions
in society in terms of differential patterns
of social reinforcement.


Social Position and Moral Orientation 


One of the defining criteria of status within 
a socioeconomic hierarchy is that the people
holding higher positions have greater power
and responsibility to evaluate and determine
their own behavior as well as to act upon the
external environment. The requirements of
their work tend to encourage independence
from immediate external supervision. Their
opportunities for realizing their aspirations
through modifying their circumstances would
make their use of internal resources meaningful
and rewarding. In contrast, the occupational
roles held by individuals of lower status tend
not to encourage initiative or self-reliance.
Their opportunities for affecting their environment
are more restricted. Their work is bounded
by its concrete material nature, and its per- 
formance is more highly specified, within fairly 
narrow limits, by sources outside of their own 
control. They should, therefore, be more 
prone to perceive external dangers, gratifica- 
tions, and restraints as determinants of their 
behavior. There is considerable evidence that 
differences of social orientation, in the direction 
described, do distinguish between the middle 
and the working classes in our society and 
characterize their values and practices in 
socializing their children (Davis & Havighurst, 
1946; Henry & Short, 1954; Kohn, 1959; 
McClelland, Rindlisbacher, & de Charms, 
1955; Miller & Swanson, 1958). These differ- 
ences would certainly be expected to have 
implications for the child’s internal or external 
orientation in moral behavior. 

If status has the kind of consequences for 
social behavior ascribed to it here, then the 
significance of the masculine and feminine sex 
roles for the child’s moral orientation should 
be, in some respects, parallel to the significance 
of social class membership. Despite the in- 
creasing equality afforded the two sexes in 
western society, it still remains true that 
greater status and self-direction of action 
attach to the masculine role and that children 
are sensitive to the differential activities and 
privileges of the two sexes (Brown, 1958; 
Lynn, 1959; Parsons, 1953; Rabban, 1950). 
Men are expected to have more control over 
their own actions as well as over their external 
environment. Women are expected to be 
responsive to direction from without rather
than to show self-reliance. They must be more
sensitive, and ready to accommodate themselves,
to external influences (Barry, Bacon,
& Child, 1957; Parsons, 1942). A woman’s
daily activities in her primary roles of wife and
mother are very much conditioned by her
husband’s social status, and they usually do
not permit her as much freedom of movement
and decision making power as accrue to his 
more crucial occupational role. 

Within the various means by which parents 
may transmit their social values and expecta- 
tions to their children, disciplinary practices 
would appear to be among the most specifically 
relevant to the child’s subsequent moral 
behavior. The prevailing view of social class 
differences in child rearing, that was held for 
some years after the well-known study by 
Davis and Havighurst (1946), was that the 
working class was more permissive and less 
frustrating than the middle class. Recent 
surveys (Bronfenbrenner, 1958; Maccoby, 
Gibbs, et al., 1954) have tended to challenge 
that view and to suggest that inconsistencies 
between different studies of the socialization 
practices of the two classes may be attributable 
to uncontrolled sampling factors or to processes 
of social change over time. It is therefore 
rather striking that in all of the surveys cited, 
as well as in others (Littman, Moore, & 
Pierce-Jones, 1957; Miller, Swanson, et al., 
1960), one persistent difference between the 
two classes continues to be found in their 
disciplinary techniques. Lower-class parents 
are more likely to use direct aggression in 
their discipline, particularly physical punish- 
ment. Middle-class parents are more prone to 
use techniques variously described as “love 
oriented” or “psychological” and consisting 
primarily of withdrawal of love, isolation, 
ignoring the child, reasoning, and explanation. 
On the basis of the little evidence available, it 
does not appear, in contrast, that there are 
consistent differences between boys and girls 
in the methods of punishment which they 
experience, although the survey by Sears et 
al. (1957) suggested that the specific technique 
of withdrawal of love might be used somewhat 
more frequently with girls. 

Many studies (Allinsmith, 1960; Heinecke, 
1953; Sears et al., 1957; Whiting & Child, 
1953) have indicated a limited degree of 
association between parental discipline and 
the child’s moral reactions to transgression. 
The implications of these studies for the 
distinctions made here between specific forms 
of moral response are somewhat difficult to 
assess, since typically a general index of 
conscience or internalization was used. In each
case, however, the use of love oriented or
psychological methods was positively related
to reactions to transgression which the investigators
regarded as evidence of “high
internalization.” The aim of distinguishing
moral responses with respect to their internal
or external orientation suggested that disciplinary
techniques might be divided, for the
purposes of the present study, into the two
broad categories of induction and sensitization.

The induction category subsumes methods 
that should tend to induce in the child reactions 
to his own transgressions which could readily 
become independent of their original external 
stimulus sources. A mother who indicates her 
disapproval by rejecting or ignoring her child, 
or by showing that she is hurt or disappointed, 
is stimulating unpleasant feelings that are not 
closely tied to her physical proximity or its 
imminence. The arousal and termination of 
these feelings are less well defined by externally
explicit punitive events conveyed through her 
immediate presence than would be the case if 
she were to punish the child with a direct 
attack. Likewise, if she reasons with the child 
or explains why his behavior is unacceptable, 
she is using a verbal and cognitive medium of 
exchange that can provide the child with his 
own resources for evaluating and modifying 
his behavior. Finally, if she asks the child why 
he behaved as he did, insists that he correct 
the consequences of his acts, or refrains from 
punishment when he takes the initiative in 
correcting himself, she is encouraging him to 
examine his actions and accept responsibility 
for them. 

The term sensitization is applied to the more 
direct techniques of physical punishment and 
verbal assaults such as yelling, screaming, or 
“bawling out.” Such techniques should give
greater weight to the mother’s punitive pre- 
sence and be unlikely to induce reactions to 
transgression that could easily become sepa- 
rated from their original external definition. 
They should instead merely sensitize the child 
to the anticipation of painful external events 
and to the importance of external demands
and expectations in defining appropriate
responses.

In view of the significance attributed to
socioeconomic status and the emphasis on
self-evaluative resources as one aspect of
internal orientation, the role that intelligence
might play in the anticipated relationships
was explored by examining the relationship
between intelligence and each of the specific 
forms of moral response. 


METHOD 


There are certain obvious difficulties in trying to
reproduce the kinds of situations around which a
variety of reactions to transgression may be elicited
and systematically observed. It is unlikely, for example,
that children would commit many of the necessary
transgressions in the presence of observers. Moreover,
some forms of moral reaction would not be easily
discernible in the child’s behavior without resorting to
techniques of verbal report and accepting the inherent
limitations of these techniques. To meet these problems,
a projective story completion device was constructed,
consisting of five incomplete stories. The advantages
of such a technique have been demonstrated in previous
studies (Allinsmith, 1960; Aronfreed, 1960). The story
completions written by the children can be analyzed
for various responses to transgression occurring in the
perception and behavior of the central figures, and these
can be taken to represent the subject’s own strongest
response tendencies. This projective method tends to
minimize the child’s awareness that his own reactions
are the center of interest. It also permits a fairly precise
structuring of a variety of the stimulus conditions
under which children usually show moral responses to
their own actions. Some recent investigations (Hokanson
& Gordon, 1958; Kagan, 1956; Purcell, 1958) have
suggested that the assumption of isomorphism between 
overt behavior and its projective analogue gains 
credibility when the projective technique is not left 
unstructured, but rather approximates as closely as 
possible the conditions under which the overt behavior 
ordinarily takes place. 

The transgressions that occurred in the stories were 
acts of aggression. Because such a considerable part 
of socialization in our society is devoted to the control
of aggression, the range of moral responses elicited in
this area of behavior was expected to be fairly broad
and representative of moral responses in general. Each
story beginning was roughly 100 words in length and
related, in simple language, an incident in which a
child (the central figure) had become very angry with
little or no justification by ordinary social standards.
The child then committed an act of aggression of a
kind generally socially prohibited but not uncommon
among children. The aggression was directed against
parents, a friendly neighbor, or a close companion, and
the stories varied in the overtness and directness of the
aggressive response. In one case, for example, a child
is about to join friends at a playground and is angered
at the slight delay occasioned by the mother’s request
to examine a rosebush and to pick any blooms which
may have appeared. Rather than take the time to
return with the roses, the child takes them and throws
them into a trash can at the playground. In another
story, a child chalks a “nasty” word on a neighbor’s
step because the neighbor’s absence prevents the child
from having a piece of play equipment that is usually
generously provided.
All of the stories attempted to convey that the
child’s action was not directly observed by others, so
that externally oriented moral responses occurring in
the story completions would reflect the child’s own
perspective and not be invited by the context of the
story. Likewise, provocation of the aggression in each
of the stories was absent or negligible, so that there
would be no realistic support for holding others responsible
for the actions of the central figure.

Two sets of stories were used, so that the sex of the 
central figures could correspond to the sex of the child 
completing the stories. The two sets were otherwise 
almost identical, with the exception of minor variations
introduced to insure that the stories would be
sex appropriate. For example, in two stories where the
transgression was provoked by the child’s being reasonably
refused something, or inadvertently deprived of it,
the specific nature of the object was made to accord
with the interests of the two sexes.


Subjects 

The subjects of the study were 122 white children 
drawn from the sixth grade classes of two public 
schools in a large urban school system.² The sample
was composed of comparable numbers of children from
the middle and working classes, and was equally
divided between the two sexes. There were 34 middle-class
and 27 working-class boys. Among the girls, 28
were middle class and 33 were working class. Socioeconomic
status was determined on the basis of the
fathers’ occupations as reported in questionnaire interviews
with the children’s mothers. The working
class group consisted of those children whose fathers
earned their living through manual labor, factory work,
or service and maintenance occupations. The fathers
of the middle-class children were salesmen, office
workers, owners and managers, and a few men engaged
in professions.


Procedure 


Story completion technique. The stories were given 
in written form to small groups of five or six children, 
sitting widely separated from one another in a large 
room. Each child individually read each of the five
story beginnings and wrote a completion to it, finishing
a given story before going on to the next. The task was
put into a context of creativity by asking the children
to write “the most interesting kinds of stories for sixth
graders,” and anonymity was preserved by having
each child place the completed stories, without any
name attached, in a large pile of already completed
sets. Since each child completed five stories, 610 stories
in all were obtained from the children.
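
As a quick arithmetic check on the figures reported under Subjects and Procedure, the cell counts can be tallied as in the sketch below. The dictionary layout and variable names are illustrative assumptions; only the counts themselves come from the text.

```python
# Cell counts as reported under Subjects (social class x sex).
# The dictionary layout is an illustrative assumption, not the author's notation.
sample = {
    ("middle", "boys"): 34,
    ("working", "boys"): 27,
    ("middle", "girls"): 28,
    ("working", "girls"): 33,
}
STORIES_PER_CHILD = 5  # each child completed five story beginnings

boys = sum(n for (_, sex), n in sample.items() if sex == "boys")
girls = sum(n for (_, sex), n in sample.items() if sex == "girls")
total_children = boys + girls
total_stories = total_children * STORIES_PER_CHILD

print(boys, girls, total_children, total_stories)  # 61 61 122 610
```

The tally agrees with the statements that the sample was equally divided between the sexes and that 610 stories were obtained.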

Questionnaire interview. The mothers of the 
children in the sample represented those who, among 
124 originally solicited, were willing to participate in a
questionnaire interview, the purpose of which was described to
them as “finding out about the different
things which mothers do to help their children grow.”
Sampling biases attributable to the criterion of
voluntary participation were, thus, virtually eliminated.
When the mother’s agreement to participate was
received, she was visited at her home by one of four
interviewers, two of whom were male and two female.
Though the majority of the interviews were conducted
by the male interviewers, there was no difference between
the subsamples of mothers seen by the two sets
of interviewers in their social class or sex of child
characteristics.

After about 5 minutes of relatively informal questions
aimed at attaining rapport with the mother and
establishing the nature of her husband’s occupation,
the interview turned to a series of 12 distinct questions
designed to elicit the mother’s report of specific disciplinary
techniques used with the child who was the
subject of the study. Each question described a par- 
ticular form of aggressive behavior and the circum- 
stances under which it occurred, and then asked the 
mother how she usually responded when her child 
behaved in that way. The interviewers were provided 
with standardized probes for additional techniques or 
more detailed information. The situations described 
to the mother included not only obvious misbehavior 
such as throwing food or being verbally offensive 
toward her, but also more subtle incidents involving 
friends or property, where the child’s responsibility 
was somewhat ambiguous. Two of the questions were 
phrased to explore conditions under which the mother 
might withhold punishment or make it contingent
upon the child’s subsequent actions, but even these
commonly brought forth further disciplinary methods.
Some of the questions dealt with the mother’s current
methods, while others asked about her methods in
response to the kind of behavior more characteristic
of the earlier years of childhood.

Intelligence measure. Scores on the sixth grade form of
a test of verbal intelligence were available for almost
all of the children in the sample. This test is a standardized
instrument used throughout the entire school
system from which the children were drawn. It has a
test-retest reliability of .91, a correlation of .78 with
the Otis Quick-Scoring Inventory, and a correlation of
.72 with the child’s average standing across all sixth
grade achievement tests.


Classification of Moral Responses 


The broad outlines of the classification to be de- 
scribed were determined by the theoretical orientation 
of the survey, although some of the details of cate- 
gorization were empirically adapted to accommodate 
the distinct form of typical responses in the story
completions. Each story was individually scored for
the presence of a number of independently defined
forms of moral response. For some types of response,
further distinctions were made among the story conditions
defining their pattern of occurrence, in order to
examine more closely the extent of their internal or
external orientation. A single story almost invariably
showed more than one form of moral response, and was
therefore entered in more than one subcategory within
the system of classification. A given subcategory was
used only once, however, in the scoring of a particular
story. The primary subcategories, and the major
categories of classification under which they were
organized, were as follows:


Self-criticism. This category was used whenever 
the moral reactions of the central figures showed 
evidence of self-criticism, self-blame, the explicit 
recognition of wrongdoing, or any form of self- 
appraisal of the kind associated with the experience 
of guilt. A slightly more liberal interpretation was 
made of this category than of others, so as to be 
certain that there would be no underestimation of 
the presence of self-evaluation. Disturbances of 
thought or action, for example, were included when 
they appeared to be associated with reflection upon 
the transgression. Being “sorry” was not in itself
taken as an indication of self-criticism, since it 
might actually be a response to external pressures 
or anticipated unpleasant external consequences. 
But remorse was included when it was felt specifi- 
cally for the act of transgression and seemed to imply 
self-evaluation. 

Correction of deviance. A wide range of responses 
was classified here, all of them indicating an attempt 
to return to the appropriate social boundaries of 
behavior. These took the form of admission to 
others of responsibility for the transgression (confession),
expression of remorse for actions or intentions
(apology), removal or amelioration of injurious
consequences (reparation), either concretely or 
through affection or helpfulness toward those per- 
ceived as harmed by the transgression, and resolu- 
tions or commitments to act more acceptably in the 
future (modification of future behavior). 

Degree of activity in self-correction. This index was 
a crude continuum constructed to use the total 
pattern of responses in each story completion. Its 
purpose was to evaluate the extent to which moral 
responses were self-defined and initiated in the 
child’s own behavior in a way that indicated an 
active acceptance of responsibility, or, conversely, 
the extent to which they revealed a passive reliance 
on external events. A story was classified as high 
on the index if the sequence of moral responses included
one of the more active forms of self-correction,
such as reparation or modification of future
behavior, without any dependence on external
events. Stories were scored as intermediate if the
sequence of moral responses was self-initiated but
did not include an active form of self-correction
(for example, if only confession or apology occurred),
or if the sequence was externally initiated but subsequently
included an active self-correction not
imposed or required by external conditions. Stories
were classified as low if they contained only externally
defined consequences of transgression, or if
such consequences were followed only by corrective
actions that were relatively passive and essentially
required by the external situation (for example, if
confession or apology occurred only after the transgression
had been discovered and punishment
threatened).

For all of the other categories in the classification
of moral responses, social class and sex comparisons
could be made on the basis of the subjects’ total
scores, which were obtained simply by counting the
number of instances of a category occurring over the
five stories. This index, however, was an attempt to
express the degree to which a certain property of
moral responses was present rather than its mere
presence or absence. It was therefore necessary to
devise a means of combining the evaluations of
individual stories so as to arrive at a composite
evaluation of a subject’s entire set of stories. Accordingly,
values of zero, one, and two were arbitrarily
assigned to stories scored, respectively, in the
low, intermediate, and high categories. These values
were then summated across the five stories to represent
each subject’s total degree of activity in self-correction.
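
The scoring rule for this index can be illustrated with a minimal sketch. It assumes that each story completion has already been reduced by a coder to two yes/no judgments; these two flags, and the function names, are hypothetical simplifications introduced here and not part of the original scoring manual. Only the three categories and the values zero, one, and two come from the text.

```python
# Values assigned in the text to the three story-level categories.
LEVEL_VALUES = {"low": 0, "intermediate": 1, "high": 2}


def classify_story(externally_initiated: bool, active_self_correction: bool) -> str:
    """Reduce one story completion to low / intermediate / high.

    Both arguments are hypothetical coder judgments:
    externally_initiated   -- the moral sequence began only with external events
                              (discovery, punishment, demands of others)
    active_self_correction -- reparation or modification of future behavior
                              occurred without being imposed from outside
    """
    if active_self_correction:
        return "intermediate" if externally_initiated else "high"
    return "low" if externally_initiated else "intermediate"


def activity_total(levels):
    """Sum the 0/1/2 values across one subject's stories."""
    return sum(LEVEL_VALUES[level] for level in levels)


# One hypothetical subject with five stories.
stories = [
    classify_story(False, True),   # high
    classify_story(False, False),  # intermediate (e.g. confession only)
    classify_story(True, True),    # intermediate
    classify_story(True, False),   # low
    classify_story(False, True),   # high
]
print(stories)
print(activity_total(stories))  # 2 + 1 + 1 + 0 + 2 = 6
```

With five stories per subject, totals computed this way range from 0 to 10.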

External resolution. This category was used to 
describe responses characterized by the central 
figure’s perception of the consequences of trans- 
gression as being primarily defined by external 
events. Three basic forms were distinguished. In the 
first type, discovery and punishment, other people 
discovered the transgression and responded with 
punishment or criticism. In some cases, discovery 
was invited or punishment provoked by the central 
figure’s actions subsequent to the transgression. The 
second type consisted of experiences of unpleasant 
fortuitous consequences following the transgression 
through indirect effects or impersonal circumstances.
The central figure might, for example, suffer
a painful accident, or be left with an unpleasant
task, or have something of value lost or destroyed.
The third type of external resolution was a focus
on external responsibility, in which the actions of
others were held responsible for the transgression or
were used to justify it.

Externally oriented initiation and performance. A 
set of somewhat more refined distinctions was 
developed to describe various ways in which even 
the central figure’s own moral actions, such as self- 
criticism or corrections of deviance, might never 
theless reveal some degree of reliance on the ex 
ternal social environment. Thus, certain moral 
reactions received an external initiation in that they 
occurred only upon the influence or demands of 
other people. Likewise, public expressions of remorse 
and promises of conformity were considered a 
display of moral responses (often with the result 
of being forgiven). And acts of reparation were 
characterized as dependent on a use of external
resources when the assistance of other people (and 
occasionally even of God) was sought in carrying 
them out. 


Classification of Maternal Discipline 


The disciplinary practices reported by the mothers 
were divided into the two broad categories of induction 
and sensitization techniques on the basis of the con-
ceptions already described. Three types of techniques
were classified as induction: withdrawal of love in the
form of rejecting or ignoring the child, indicating dis-
appointment, refusing to speak to the child, or telling
the child that he ought to feel bad; techniques in-
fluencing the child to assume or examine his own re-
sponsibility, such as asking him to report or account
for his behavior, insisting upon his making reparation,
or encouraging him in various ways to define trans-
gressions for himself or to initiate his own moral
responses; explanation of relevant standards (including
“reasoning” and “talking things over”), describing the
consequences of his actions to the child, suggesting
appropriate alternative actions, or simply telling the
child what aspects of his behavior were unacceptable.
All forms of physical punishment and uncontrolled 
verbal assaults such as yelling, shouting, screaming, or 
“bawling out” were classified as sensitization tech- 
niques. 

Other techniques used by the mothers which could 
not be clearly coordinated to the behavioral impact 
ascribed to induction and sensitization methods were 
excluded from the classification. Among the most 
frequent of these were various forms of restriction and 
deprivation, the effects of which were considered to be
highly dependent on their behavioral context and style
of administration. Discipline described with such terms
as “scolding,” “lecturing,” or “criticism,” without
further elaboration, was also viewed as ambiguous 
within the induction-sensitization framework, since it 
was not obvious that it provided self-evaluative re- 
sources for the child nor could it, on the other hand, be 
assumed to represent a direct attack with the function 
of merely eliminating the offending behavior. 

For each mother, two separate counts were made 
across the 12 disciplinary situations used in the ques- 
tionnaire—one for the number of situations in which 
she reported induction techniques and the second for 
the number in which she reported sensitization tech- 
niques. In some instances, of course, the mother’s 
response to a single situation contributed to both counts.
Each situation was permitted to enter only once into 
each count, however, even though mothers often men- 
tioned more than one specific kind of induction or 
sensitization technique in a given situation. The entire 
sample of mothers was then divided into two groups on 
the basis of which of the two counts was higher. Three 
mothers whose induction and sensitization counts were 
identical were placed in the group showing a predomi- 
nance of sensitization techniques in order to have the 
two groups as nearly equal in size as possible. 
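
The counting and grouping rules just described can be sketched as follows. The data layout (one list of technique types per situation) and all names are assumptions made for illustration; the rules themselves (each situation enters a count at most once, and ties go to the sensitization group) are taken from the text.

```python
def discipline_counts(situations):
    """Count the situations in which each type of technique was reported.

    `situations` holds one list of technique types per disciplinary situation.
    A situation enters each count at most once, however many specific
    techniques of that type the mother mentioned.
    """
    induction = sum(1 for techniques in situations if "induction" in techniques)
    sensitization = sum(1 for techniques in situations if "sensitization" in techniques)
    return induction, sensitization


def discipline_group(induction, sensitization):
    """Assign a mother to a group; identical counts go to the sensitization group."""
    return "induction" if induction > sensitization else "sensitization"


# Hypothetical mother: 12 situations, two of which contribute to both counts
# and two of which drew only unclassified techniques.
situations = (
    [["induction"]] * 4
    + [["induction", "sensitization"]] * 2
    + [["sensitization", "sensitization"]] * 4
    + [[]] * 2
)
counts = discipline_counts(situations)
print(counts, discipline_group(*counts))  # (6, 6) sensitization
```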


Reliability of Classification 


The entire classification of moral responses was 
independently applied by two judges,² one of whom
was the writer, to the 100 stories written by a sub- 
sample of 20 children. The children were selected so as 
to equally represent both sexes and the two social 
classes, but were randomly chosen within this restric- 
tion. Each specific subcategory of moral responses 
scored by either of the judges was treated as an instance 
of classification, and was entered as a disagreement if 
not scored by both. In the case of the index of degree of
activity in self-correction, failure of the two judges to
score the same degree of activity was counted as a dis-
agreement. Percentages of agreement were computed
over the total number of instances of classification in
each of the five major categories of the system. These
percentages of agreement, summarized in Table 1,
indicate that the classification system can be objectively
and consistently used. The rather high reliability of
classification probably reflects the fact that the system
was specified in great detail and left little room for
inference or interpretation. Comparable reliability
has been obtained with a similar system in a previous
study (Aronfreed, 1960).

² The author is indebted to Stephanie Manning for
work in establishing the reliability of classification.


TABLE 1
PERCENTAGE OF AGREEMENT BETWEEN TWO INDEPENDENT JUDGES FOR EACH MAJOR CATEGORY
OF CLASSIFICATION OF MORAL RESPONSES

Category of moral responses                          Percentage of agreementᵃ

Self-criticism                                       94
Correction of deviance                               96
Degree of activity in self-correction                90
External resolution                                  97
Externally oriented initiation and performance       98

ᵃ Based on a sample of 100 stories completed by 20 subjects.

The same two judges independently classified the 
disciplinary techniques reported by a randomly se- 
lected subsample of 20 mothers. Although the tech- 
niques were eventually to be grouped into the two 
broad types of induction and sensitization, reliability 
was based on the identification of specific forms of 
discipline, and a disagreement was entered whenever a 
particular form was scored by one judge but not by 
the other. The judges agreed on 97% of the total 
number of instances of classification of induction 
techniques and on 98% of the instances of sensitization 
techniques. 
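
The agreement index used for both classifications can be sketched as below, under the assumption that each judge's scoring is recorded as a set of (unit, subcategory) pairs; that representation and the names are illustrative only. Every pair scored by either judge is an instance of classification, and it counts as an agreement only if both judges scored it.

```python
def percent_agreement(judge_a, judge_b):
    """Percentage of instances of classification on which two judges agree.

    Each argument is a set of (unit, subcategory) pairs scored by one judge.
    """
    instances = judge_a | judge_b    # scored by either judge
    agreements = judge_a & judge_b   # scored by both judges
    if not instances:
        raise ValueError("no instances of classification were scored")
    return 100.0 * len(agreements) / len(instances)


# Hypothetical scoring of three stories by two judges.
judge_a = {(1, "confession"), (1, "apology"), (2, "reparation"), (3, "confession")}
judge_b = {(1, "confession"), (2, "reparation"), (3, "confession"), (3, "display")}
print(percent_agreement(judge_a, judge_b))  # 3 agreements out of 5 instances = 60.0
```

Because an instance enters the denominator whenever either judge scores it, a response that one judge overlooks lowers the percentage just as a conflicting score would.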


RESULTS AND DISCUSSION

Table 2 shows, for each of the subcategories 
in the classification of moral responses and 
for the major types into which they were 
grouped, the frequencies and percentages of 
occurrence among both subjects and stories. 
In order to assess the role of self-evaluation 
in correction of deviance and in external 
resolution, instances of each specific form of 
these two types of response were separately 
tabulated with respect to the presence or 
absence of self-criticism in the stories in which 
they occurred. In computing percentages, a 
common basis for comparing different forms 
and patterns of moral response was provided 
by taking all frequencies as proportions either 
of the total number of subjects or the total 
number of stories. 
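
As a small worked illustration of the common basis for percentages, the sketch below converts a frequency into a percentage of the 122 subjects or the 610 stories. The function name is hypothetical, and rounding to one decimal place is an assumption; the two frequencies used are those reported for self-criticism.

```python
N_SUBJECTS = 122  # children in the sample
N_STORIES = 610   # five story completions per child


def percentage(frequency, total):
    """Express a frequency as a percentage of the total, to one decimal place."""
    return round(100.0 * frequency / total, 1)


# Self-criticism: shown by 90 subjects and in 171 stories.
print(percentage(90, N_SUBJECTS))   # 73.8
print(percentage(171, N_STORIES))   # 28.0
```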

The entries given for the broad response 
types in Table 2 represent frequencies and
percentages of occurrence of these types in
any specific form. A given subject, and even a
single story, frequently showed multiple 
instances of moral responses classified within 
the same major type. Consequently, the 
entries presented for any major type are 
always smaller, and often considerably so, 
than the sum that would be obtained by 
adding the totals recorded for each of its 
specific forms. The single exception to this 
pattern occurs, in the case of the stories only, 
for the index of degree of activity in self- 
correction, since there a rating of degree is 
given to every story and is applied to the whole 
story as a unit. Similarly, for those forms of 
moral response which are separately tabulated 
according to whether or not they are accom- 
panied by self-criticism, the entries for subjects 
showing any instance of a response are smaller 
than the combined entries for the presence and 
absence of self-criticism, since a particular 
subject may show some instances of the
response that are accompanied by self-criticism
and others that are not. The entries for stories
showing any instance of a response do, how-
ever, represent the combined entries for the
presence and absence of self-criticism, since in
a single story self-criticism can only be either
present or absent.

The most immediately obvious feature of 
the distribution of moral responses is their 
great number and variety. Even restricting 
our attention to self-criticism, correction of 
deviance, and external resolution, and exclud- 
ing for the moment the more refined patterning 
of moral responses, we see that each trans- 
gression must be followed by multiple, 
sequential reactions. It is apparent that, 
among the perceptions and behaviors that 
constituted moral responses, a single form of 
response was generally not sufficient to define 
the consequences of transgression. 


TABLE 2
FREQUENCY AND PERCENTAGE OF SUBJECTS AND STORIES SHOWING EACH CATEGORY OF
CLASSIFICATION OF MORAL RESPONSES

                                                                Subjects          Stories
Category of moral responses                                     N      %ᵃ         n      %ᵃ

Self-criticism                                                  90    73.8       171    28.0
Correction of deviance (any form)
    Self-criticism present                                      84    68.8       160    26.2
    Self-criticism absent                                      113    92.6       325    53.3
    Any instance                                               117    95.9       485    79.5
  Confession
    Self-criticism present                                      58    47.5        91    14.9
    Self-criticism absent                                       94    77.1       182    29.8
    Any instance                                               108    88.5       273    44.7
  Apology
    Self-criticism present                                      18    14.7        18     2.9
    Self-criticism absent                                       46    37.7        55     9.0
    Any instance                                                54    44.3        73    11.9
  Reparation
    Self-criticism present                                      56    46.2        79    12.9
    Self-criticism absent                                       99    81.1       165    27.1
    Any instance                                               110    90.2       244    40.0
  Modification of future behavior
    Self-criticism present                                      38    31.1        53     8.7
    Self-criticism absent                                       56    46.2        89    14.6
    Any instance                                                71    58.2       142    23.3
Degree of activity in self-correction
    Low                                                        105    86.1       212    34.8
    Intermediate                                               116    95.1       248    40.7
    High                                                        89    72.9       150    24.5
External resolution (any form)
    Self-criticism present                                      79    64.8       125    20.5
    Self-criticism absent                                       96    78.7       168    27.5
    Any instance                                               118    96.7       293    48.0
  Discovery and punishment
    Self-criticism present                                      58    47.5        70    11.5
    Self-criticism absent                                       84    68.8       122    20.0
    Any instance                                               107    87.7       192    31.5
  Unpleasant fortuitous consequences
    Self-criticism present                                      34    27.9
    Self-criticism absent
    Any instance
  Focus on external responsibility
    Self-criticism present                                      43    35.2        52     8.5
    Self-criticism absent                                       47    38.5        52     8.5
    Any instance                                                75    61.5       104    17.0
Externally oriented initiation and performance (any form)       93    76.2       150    24.6
  External initiation                                           55    45.1        75    12.3
  Display                                                       51    41.8        75    12.3
  Use of external resources                                     43    35.2        47     7.7

ᵃ Percentages based on Ns of 122 subjects and 610 stories.


Self-Criticism

Over one-fourth of the children show no
evidence at all of self-criticism. The 90 sub-
jects who do show evidence of self-criticism,
while they are clearly in the majority, do not
use it as a recurrently characteristic response,
as can be seen from the fact that it occurs in only
28% of the stories. The restricted role of self-
criticism is also reflected in the limited extent
to which it accompanies corrections of deviance
and external resolution. For each of the
specific forms of correction of deviance, and in
the entries for instances of any form, the
percentage of subjects for whom there is an
accompanying self-criticism is uniformly con-
siderably smaller than the percentage of
subjects for whom there is not. Among the
stories, also, self-criticism is typically absent
roughly twice as frequently as it is present
when corrections of deviance occur, and the
disproportion is even greater in the case of
apology. It would certainly not seem that any
of the different kinds of correction of deviance
presuppose the presence of self-evaluation.
Corrections of deviance should not, therefore,
be necessarily taken as indicating the operation
of cognitive resources of moral judgment.
They should instead be viewed as ambiguous
in their implications for moral orientation.
Apparently, they can be used rather mechani-
cally, either because they have become instru-
mental in reducing what are simply unpleasant
feelings or perhaps because they have been
successful in avoiding anticipated external
punishment. 

The tendency for self-criticism to be absent 
more often than present is less consistent as a 
context for the various forms of external 
resolution. In the case of each form, however, 
it is absent at least as often as, if not more 
often than, it is present. The substantial 
extent to which each of the different forms of 
external resolution is accompanied by self- 
criticism is a finding of special interest. There 
have been some attempts (Allinsmith, 1960; 
Aronfreed, 1960; Flugel, 1947) to conceptualize 
moral responses that are highly externally 
defined in such a way that self-evaluation is 
still a condition of their occurrence. The 
concept of “defense against guilt,” for example, 
specifies that “externalized” moral responses 
defend the person against the painful experi- 
ence of self-criticism which is presumed to be 
always present following a transgression, but 
may somehow not be given conscious expres- 
sion. The patterning of moral responses does 
not, however, support the view that the 
different forms of external resolution are 
merely secondary elaborations whose function
is to avoid self-criticism. It would appear,
rather, that self-criticism and external resolu-
tion are parallel and complementary conse-
quences of transgression, and that neither one
precludes or rests upon the presence of the 
other. There is no reason to assume that all 
moral responses are to be interpreted as trans-
formations of the experience of guilt.


Corrections of Deviance

Corrections of deviance are clearly indis- 
pensable to almost all of the children and are 
more characteristic of their behavior than any
other type of moral response. They occur in
almost 80% of the stories. Confession and 
reparation are decidedly the most common 
forms. Either of these kinds of moral reaction 
would, in itself, come close to accounting for 
the number of subjects who show correction 
of deviance in any form. It is also interesting 
to note that over 40% of the children give no 
indication at all of the modification of future 
behavior following a transgression. 

The various levels of degree of activity in 
self-correction are each manifested by very 
large proportions of the total group of children 
and are fairly well distributed throughout the 
stories. There is discernible, nevertheless,
some tendency for moral responses showing a
self-defined and active acceptance of responsi-
bility to be less typical of the sample as a
whole. Such responses occur in only about
one-fourth of the stories and do not appear at
all for over 25% of the subjects. The entire
pattern of findings on this index points to the
heavy situational determination of corrections
of deviance and to their very uncertain value
as indicators of moral autonomy.


Externally Defined Responses 


External resolution, like correction of 
deviance, is a component of moral responses 
for virtually every child. Its various forms, 
while not as pervasive as corrections of 
deviance, are present in almost half of the 
stories. Discovery of the transgression and 
punishment by others is more common than 
unpleasant fortuitous consequences or a focus 
on external responsibility, but all of the forms 
are substantially present. What cannot be 
conveyed in the formal presentation of re- 
sponses is the frequency with which external 
resolution is brought about through the most 
devious and artificial means, particularly in 
view of the fact that the story beginnings make 
it explicit that only the central figure knows of 
the transgression. Just as in actual life situa- 
tions, however, there generally remains in the 
stories some minimal possibility of external 
consequences. This possibility is commonly 
seized upon and exploited in those instances 
where external resolution occurs. Apparently, 
this kind of reaction is sometimes an indis- 
pensable part of the child’s equipment for 
responding to a transgression. 







The importance of external events in 
defining the consequences of transgression is 
further attested by the extent to which even 
moral actions carried out by the central figure, 
such as corrections of deviance, show an
external orientation. The initiation of moral
responses through the influence or demands of
other people, the public display of some of
these responses, and the use of external
resources in carrying out reparation each
occur in the stories of a sizable percentage of
subjects. Over three-fourths of the subjects
show evidence of some form of reliance on an
external social context in the initiation and
performance of moral actions.


Nature of Reactions to Transgression 

The findings of the survey indicate that 
self-evaluation of the kind implied in the 
concept of guilt is not a prerequisite of interna- 
lized responses to transgression and that such 
responses frequently take a variety of forms in 
which the consequences of transgression are 
perceived as externally defined. Such a con- 
clusion does not deny that cognitive resources 
of moral judgment are an important aspect of 
morality and have special implications for 
social behavior. But the use of such resources, 
however salient it may seem in the behavior of 
some individuals, would appear to constitute 
only a special case of the generalized dimen- 
sions needed to describe moral responses. The 
intimate dependence of so many moral re- 
actions on external events is, in some sense, 
just what one might expect. It would be 
strange, indeed, if moral responses assumed 
properties entirely divorced from the external 
socializing situations in which they were 
originally learned. 

If self-criticism is not to be given a funda- 
mental status in moral behavior, but is instead 
to be viewed as only one of a set of learned 
moral reactions, we are left with the problem 
of how to conceptualize the basic nature of 
internalized responses to transgression. To 
resolve the problem in a way that takes account 
of the variety and patterning of these re- 
sponses, we might specify that a behavior be 
defined as a transgression for an individual to 
the extent that it has been followed, in his 
experience, by negative (punitive) external 
sanctions. Accordingly, the negative affect 
that attaches to the behavior through social
punishment would be considered, when it
became independent of such punishment, the
primary and invariant component of internali-
zation.⁴ This state of negative affect may be
regarded as endowed with motivating proper-
ties and as being reducible through a number
of different kinds of responses. The responses
acquire their instrumental value because they
reproduce significant cues in the original
socializing situations. These cues are, in fact,
aspects of the behavior of the child himself or
of the socializing agents which were frequently
associated with the termination, avoidance, or
minimizing of punishment. The child’s criti-
cism of his own actions or feelings is one
example of such an instrumental response. Its
use is probably determined by the extent to
which the patterns of social reinforcement that
he has experienced around transgressions have
been characterized by the explicit verbalization
of standards and their application to his
behavior. When a child’s punishment incor-
porates verbal criticism, the criticism itself
will frequently come to signify the termination
of the painful feelings that accompany its
anticipation. The child can then later himself
terminate such feelings by reproducing the
criticism in his own behavior.⁶ The same kind
of instrumental function can be ascribed to
corrections of deviance and to responses in
which punishment is perceived in the actions of
other people or in impersonal fortuitous events.
All of these functionally equivalent responses
might be referred to as moral consequences in
order to emphasize that externally oriented
responses, and those in which moral judgment
is not apparent, have important attributes in
common with responses which clearly exhibit
self-evaluation.

⁴ It has already been asserted that this affect should
not in itself be referred to as guilt, since it is not neces-
sarily accompanied by the cognitive phenomena
usually associated with that term. To refer to the
affect as anxiety seems more warranted, though such
a general characterization possibly obscures a variety
of qualities of experience dependent on the specific
nature of the punishment from which they derive.

⁵ The term punishment is not used here in any re-
stricted sense, but rather to convey the entire range of
reinforcements that may induce painful feelings in a
person. In this broad sense, the application of negative
sanctions may take place through the most subtle and
indirect kinds of interaction between people (as when,
for example, a mother unintentionally withdraws from
a close physical contact with her child because of some-
thing that he is doing).

⁶ Sears, Maccoby, and Levin (1957), and Whiting
and Child (1953), provide an interesting contrast to
this interpretation of the role of self-criticism. They
suggest that the child reproduces the critical applica-
tion of its parents’ standards to its own behavior, not
because the criticism is associated with the termination
of negative affect, but rather because there is a gen-
eralized positive affect attached to parental presence
and actions.


TABLE 3
FREQUENCY BY SOCIAL CLASS AND SEX OF CHILDREN SCORING HIGH IN EACH CATEGORY OF
CLASSIFICATION OF MORAL RESPONSES

                                              Middle class        Working class
Category of moral responses                   Boys     Girls      Boys     Girls

N                                              34       28         27       33

Self-criticism                                 16       18          6       13
Confession                                     15       19         10       11
Apology                                        12       16          9       18
Reparation                                     18       20         14       16
Modification of future behavior                19       18         15       22
Degree of activity in self-correction          19       16         13       17
Discovery and punishment                       13        9          9       20
Unpleasant fortuitous consequences             13        7         18       24
Focus on external responsibility               12       17         18       29
External initiation                             9       15          9       23
Display                                        13       18          7       15
Use of external resources                      10       14         10       11

Note.—A subject’s score on any category is the number of in-
stances of that category occurring across the five story comple-
tions, where each category may be counted only once in a given
story. The single exception is in the case of degree of activity in
self-correction, where a subject’s score is the summation of values
assigned to the total pattern of moral responses in each of the
five stories (see Method section).


Social Patterning of Moral Consequences


Table 3 shows the frequencies with which 
children in each of the four subgroups defined 
by social class and sex received high scores on 
the various subcategories in the classification 
of moral responses.⁷ High and low scores were
determined, independently for each sub- 
category, by splitting the range of scores for 
the subcategory into the two segments that 
would most nearly equally divide the entire 
sample of 122 subjects. Chi square values for 

⁷ Social class and sex comparisons are not shown for
the cumulative frequencies in the broader response
types (major category headings). Significant variations
in these frequencies could easily be generated by the
response distributions in certain specific subcategories.
Thus, they could not be interpreted as providing any
information beyond that already conveyed in the sub-
categories, and would actually tend to inflate the
apparent number of significant differences.
each of the possible social class and sex com-
parisons are presented in Table 4. 

Inspection of Tables 3 and 4 reveals a series 
of significant differences which are, almost 
without exception, those anticipated. The 
most consistent differences are found among 
the aspects of moral consequences that are the 
most unequivocal indicators of moral orienta- 
tion. Thus, middle-class children, regardless of 
their sex, show more evidence of self-criticism 
than do working-class children and are con- 
siderably less likely to resolve transgressions 
through the perception of unpleasant fortuitous 
consequences or a focus on external responsi- 
bility. Similarly, the boys, irrespective of their 
socioeconomic status, do not emphasize 
external responsibility as frequently as do the 
girls and appear much less dependent on an 
external initiation of their own moral actions. 
The tendency of girls to publicly display their 
moral reactions more often than boys is also 
fairly uniform, though it does not attain 
statistical significance in the working class 
group. 
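
The comparisons in Table 4 are chi square values for 2 × 2 tables with correction for continuity, computed from the Table 3 frequencies. A minimal sketch follows, assuming the usual form of the correction (subtracting n/2 from |ad − bc|) and no truncation at zero when |ad − bc| is smaller than n/2; the function name is hypothetical. The example uses the self-criticism frequencies for boys, 16 of 34 in the middle class and 6 of 27 in the working class.

```python
def yates_chi_square(a, b, c, d):
    """Chi square with continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Continuity correction applied as written; not truncated at zero.
    corrected = abs(a * d - b * c) - n / 2
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Rows: middle-class and working-class boys. Columns: high and low scorers.
chi_square = yates_chi_square(16, 34 - 16, 6, 27 - 6)
print(round(chi_square, 2))  # 3.02
```

The same call with the girls' frequencies, `yates_chi_square(18, 10, 13, 20)`, gives 2.83 when rounded to two decimal places.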

The appearance of consistent social class 
differences across both sexes, and of sex 
differences not confined to a particular class 
level, suggests that each of these two dimen- 
sions of social position has a general and 
independent significance for the child’s moral 
behavior. At the same time, the marked 
parallelism of the moral orientations associated 
with the two kinds of status distinction 
supports the view that their similar effects are 
mediated through common elements of social 
reinforcement defining the extent of oppor- 
tunity to evaluate and determine one’s own 
actions or, conversely, the extent of one’s 
dependence on the external environment. The 
common effects are especially striking when 
both are exhibited in the same form of moral 
response, as happens in the tendency to focus 
on external responsibility. 

It can be seen that the parallel variations in 
moral orientation associated with social class 
and sex role are not identical in the specific 
forms that they assume. Social class differences 
appear to center on the distinction between 
moral consequences defined primarily in terms 
of the child’s own actions and those defined 
primarily in terms of external events. Sex 
differences, in contrast, seem to reflect more
of the variability in the extent to which moral
consequences occurring in the child’s own
actions are nevertheless dependent on the
support of the external environment. It seems
hardly surprising that the particular forms of
moral response that vary with the two indices
of status are not the same. Social class and
sex role carry great differences in social
definition as well as the relevant similarities
pointed out here. These differences would
undoubtedly influence the specific way in
which a child’s moral orientation was ex-
pressed. It should also be noted that many of
the social class and sex differences are not
extreme even when they are statistically
significant. The moderate degrees of difference
may simply reflect the fact that the social
experiences which affect the child’s moral
orientation are not, in our society, completely
distinct with respect to variations in either
socioeconomic status or sex role.

The least evidence of significant effects of
social position occurs among the various forms
of correction of deviance. The absence of the
marked associations with status found among
other aspects of moral consequences accords
with the view that corrections of deviance are,
in themselves, ambiguous with respect to
moral orientation. Although they represent
the individual’s own actions in defining the
consequences of transgression, they are fre-
quently highly influenced by external events.
In fact, a central distinction between the
moral actions of boys and girls, as has already
been pointed out, lies in the extent to which
they are externally initiated and displayed to
others.


TABLE 4
CHI SQUARE VALUES FOR SOCIAL CLASS AND SEX COMPARISONS OF CHILDREN SCORING HIGH IN
EACH CATEGORY OF CLASSIFICATION OF MORAL RESPONSES

                                          Middle class vs. Working class     Boys vs. Girls
                                                                             Middle    Working
Category of moral responses               Boys      Girls     Total          class     class      Total

Self-criticism                            3.02*     2.83*     5.73**         1.21      1.31       2.14
Confession                                0.09      5.91**    4.08*          2.60      0.00       0.53
Apology                                   0.01      0.00      0.03           2.14      1.91       4.77*
Reparation                                0.03      2.42      1.15           0.14      0.00       0.33
Modification of future behavior           0.06      0.01      0.00           0.17      0.38       0.86
Degree of activity in self-correction     0.12      0.03      0.28           0.02      0.00       0.00
Discovery and punishment                  0.02      3.85*     1.57           0.05      3.40*      1.21
Unpleasant fortuitous consequences        3.80*                              0.70      0.05       0.03
Focus on external responsibility          4.74*     4.65*     12.93**        3.03*     2.79*      7.85**
External initiation                       0.09      1.06      2.07           3.68*     6.50**     11.92**
Display                                   0.55      0.14      1.70           3.19*     1.67       4.80*
Use of external resources                 0.13      1.12      0.06           1.94      0.00       0.56

Note.—Chi square values for 2 × 2 contingency tables (employing correction for continuity) based on frequencies in Table 3.
* p < .05, one-tailed test.
** p < .01, one-tailed test.

The social class differences include, in addi- 
tion to those consistent across both sexes, 
some that are restricted only to girls. Middle- 
class girls show more evidence of confession 
and working-class girls more frequently 
perceive discovery and punishment by others 
as defining the consequences of their actions. 
The entire pattern of differences between the 
two classes provides an instructive contrast 
with the findings of two other surveys of 
children’s reactions to transgressions that 
failed to reveal significant social class varia- 
tions. One of these (Allinsmith, 1960) used a 
story completion device, similar to the one 
employed in the present study, with junior 
high school children. Many specific reactions, 
including self-criticism, corrections of deviance, 
and those characterized here as external
resolution, were combined to derive a single 
broad evaluation of “intensity of guilt.” In the 
other survey (Sears et al., 1957), a general 
rating of the extent of development of con- 
science in kindergarten children was based on 
mothers’ reports of a wide variety of overt 
behavioral manifestations of guilt. It would 
seem that the use of a concept like internal vs. 
external orientation to treat different kinds of
moral response as distinct forms of behavior
introduces a meaningful social patterning into 
their variety. Such a patterning is not apparent 
when all internalized responses to transgression 
are presumed to be equivalent reflections of 
some underlying unitary phenomenon such as 
guilt or conscience. 

That girls rely more than boys on an ex-
ternal definition of moral consequences is a
finding of considerable interest since a seem- 
ingly contradictory assertion is commonly 
made that girls, in our society, show greater 
conscientiousness and moral sensitivity. Per- 
haps it should be emphasized that the sex 
differences in moral reactions reported here 
represent variations, not in the quality or 
breadth of morality, but rather in the pattern- 
ing of specific forms of response all of which 
are defined as internalized in that they occur 
in the absence of any explicit external sanc- 
tions. The fact that the external orientation of 
girls is localized to a considerable extent in 
their use of external resources to support their 
own moral actions may be germane to resolv- 
ing an apparent discrepancy with the findings 
of one other study reporting sex differences in 
responses to transgression. 

Sears et al. (1957) reported that, among 
the roughly 20% of their 5-6 year old children 
showing evidence of “highly developed con- 
science,” there was a slightly but significantly 
higher percentage of girls. It seems quite 
possible, however, that their use of the 
mother’s observations to assess the child’s 
moral reactions tended to put a premium on 
the kinds of response that would be easily 
discernible to the mother and perhaps even 
oriented toward her approval. Expressions of 
guilt were, in fact, defined in their study so 
as to include not only overt self-criticism and 
attempts at restitution, but also other be- 
havioral manifestations that seem more 
externally oriented (for example, asking for 
forgiveness, looking “sheepish,” seeking 
punishment, or acting in such a way as to 
invite inquiry or discovery). It is also note- 
worthy that one of their most heavily used 
indices of conscience was confession. Responses 
such as confession and apology may be regarded 
as somewhat externally oriented 
corrections of deviance since, even though they 
represent the individual’s own moral actions, 
they do require the presence of an external 
figure. The response distributions found in the 
present study indicate, for example, that the 
more frequent use of confession among middle- 
class children than among working-class 
children is restricted to girls. Indeed, the 
tendency of the girls in the middle class to 
employ confession more often than the boys 
also comes close to attaining statistical 
significance. In the combined responses for 
both social classes, girls likewise show signifi- 
cantly more evidence of apology than do boys. 
If the moral responses of boys are somewhat 
less manifest to mothers than those of girls, 
the apparent difference might conceivably 
indicate the boys’ greater independence of the 
mother as an external reinforcer rather than 
an absence of internalized reactions to trans- 
gression. 

From the point of view of parents and other 
socializing agents, the visibility of moral re- 
sponses may be considerably enhanced by the 
child’s use of overt self-criticism and appropri- 
ate verbalization of standards. These particu- 
lar forms of response might in turn be con- 
ditioned by differences in the quality and rate 
of development of verbal facility. The fact 
that girls may show in their moral behavior 
an observable conscientiousness that easily 
flows into verbal expression is certainly one 
kind of evidence of internalization. The find- 
ings of the present study suggest, however, 
that morality is not monolithic. The con- 
scientiousness of girls need not obviate the 
possibility that their moral responses are also 
dependent on their external situation in a 
way that conforms to the more general charac- 
teristics of their feminine role. Terman and 
his co-workers (1925) found, for example, in 
their studies of gifted and control children, 
that boys did not score as well as girls on five 
of seven character tests based essentially on 
verbalized knowledge of moral standards. 
But the girls were somewhat poorer on actual 
performance tests of honesty (when it appeared 
that their actions would not be known to 
others). Similarly, Hartshorne and May (1928) 
observed that boys were more resistant to 
temptation than girls on performance tests of 
“good” and “bad” behavior and were more 
consistent in their actions from one test to 
another. They concluded that differences in 
moral reputation favorable to girls were 
discrepant with differences in performance. 
Perhaps Terman and Tyler (1954) are correct 
in asserting, in their summary of sex 
differences, that the moral opinions expressed 
by girls result in their being inaccurately 
credited with superior moral qualities in their 
behavior. 

Finally, it should be pointed out that the 
nature of the associations between social 
positions and moral responses does not easily 
lend itself to the view, suggested by Piaget’s 
(1948) interpretation of moral development, 
that different types of moral orientation 
sequentially emerge with advancing age or 
experience. It would seem more appropriate 
that internal and external orientation in moral 
behavior be understood as relatively stable 
end-results of different patterns of social 
reinforcement. Such an understanding would 
not be inconsistent with other recent studies 
(Boehm, 1957; Durkin, 1959; Kohlberg, 1958; 
MacRae, 1954; Peel, 1959) in which, as in 
Piaget’s investigations, attention was devoted 
primarily to moral judgment. These 
studies, when taken together as a group, suggest 
that the child’s moral orientation, even 
as reflected only in cognitive resources, is 
more closely associated with social status and 
other indicators of cultural expectation than 
it is with age. 


Intelligence 

In order to answer the question of whether 
intelligence could be a factor in accounting 
for either the social class or sex differences, 
the total sample was divided into two equal 
groups on the basis of the verbal intelligence 
test on which IQ scores for all but three of the 
children were available. One group consisted 
of children whose scores were less than 110. 
In the other group were children whose scores 
were 110 or greater. These two groups were 
compared with respect to their frequencies of 
high and low scores on each category in the 
classification of moral responses. The comparisons 
indicated no relationship whatsoever between 
verbal IQ and any of the indices of moral 
consequences. 


Maternal Discipline 


Table 5 shows, for children in each of the 
four groups defined by social class and sex, 
the frequencies with which the disciplinary 
techniques predominantly used by their 
mothers fall into the induction or sensitiza- 
tion categories. The comparisons in Table 6 
indicate that, with children of both sexes, 
middle-class mothers use more induction 
techniques and lower-class mothers use more 
sensitization techniques. There are no signifi- 
cant differences in the types of discipline used 
with the two sexes. The social class differences 
are in agreement with the findings of other 
studies in which techniques respectively 
subsumed here under induction and sensitization 
were related in the same direction to the 
mother’s socioeconomic status. 
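
The comparisons of Table 6 can be illustrated with a short computational sketch. It applies the standard formula for chi square in a 2 X 2 contingency table with correction for continuity to the frequencies of Table 5. The function and variable names below are illustrative assumptions and are not part of the original report.

```python
def yates_chi_square(a, b, c, d):
    """Chi square for the 2 x 2 table [[a, b], [c, d]] with correction for continuity."""
    n = a + b + c + d
    # Reduce |ad - bc| by n/2 (the continuity correction), never below zero.
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Table 5 frequencies as (induction, sensitization) counts for each group.
TABLE_5 = {
    ("middle", "boys"): (21, 13),
    ("middle", "girls"): (20, 7),
    ("working", "boys"): (9, 18),
    ("working", "girls"): (14, 18),
}


def pooled(keys):
    """Sum the (induction, sensitization) counts over one or more groups."""
    induction = sum(TABLE_5[k][0] for k in keys)
    sensitization = sum(TABLE_5[k][1] for k in keys)
    return induction, sensitization


def compare(group_1, group_2):
    """Corrected chi square comparing type of discipline between two pooled groups."""
    return yates_chi_square(*pooled(group_1), *pooled(group_2))


# Social class comparisons (middle class vs. working class).
class_boys = compare([("middle", "boys")], [("working", "boys")])
class_girls = compare([("middle", "girls")], [("working", "girls")])
class_both = compare(
    [("middle", "boys"), ("middle", "girls")],
    [("working", "boys"), ("working", "girls")],
)

# Sex comparisons (boys vs. girls).
sex_middle = compare([("middle", "boys")], [("middle", "girls")])
sex_working = compare([("working", "boys")], [("working", "girls")])
sex_both = compare(
    [("middle", "boys"), ("working", "boys")],
    [("middle", "girls"), ("working", "girls")],
)

for label, value in [
    ("Class, boys", class_boys),
    ("Class, girls", class_girls),
    ("Class, both sexes", class_both),
    ("Sex, middle class", sex_middle),
    ("Sex, working class", sex_working),
    ("Sex, both classes", sex_both),
]:
    print(f"{label}: {value:.2f}")
```

Rounded to two decimals, the six values agree with those reported in Table 6 (3.80, 4.34, 8.50, 0.55, 0.30, and 0.55).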

The frequencies of high and low scores on 
three indices of moral consequences, among 
children whose mothers use predominantly 
induction or sensitization techniques, are 
presented in Table 7. Only those categories 
of moral consequences are shown for which the 
associations with maternal discipline are statis- 
tically significant when taken over the entire 
sample. There is only one instance in which a 


TABLE 5
FREQUENCY OF PREDOMINANT USE OF INDUCTION OR
SENSITIZATION TECHNIQUES BY MOTHERS OF CHILDREN
IN EACH SOCIAL CLASS AND SEX GROUP

Predominant maternal      Middle class       Working class
discipline                Boys    Girls      Boys    Girls

Induction                  21      20          9      14
Sensitization              13       7         18      18

Note.—Total N = 120. Adequate interview data on discipline
not available for two girls.


TABLE 6
CHI SQUARE VALUES FOR SOCIAL CLASS AND SEX OF
CHILD COMPARISONS OF PREDOMINANT
MATERNAL DISCIPLINE
(Induction vs. Sensitization)

Comparison                           Chi square value

Middle class vs. Working class
    Boys                                  3.80*
    Girls                                 4.34*
    Both sexes                            8.50**
Boys vs. Girls
    Middle class                          0.55
    Working class                         0.30
    Both classes                          0.55

Note.—Chi square values for 2 X 2 contingency tables (employing
correction for continuity) based on frequencies in Table 5.

* p < .05, one-tailed test.

** p < .01, one-tailed test.
significant variation limited to a single subgroup is eliminated
from presentation by this criterion—girls whose mothers use
predominantly sensitization techniques are more likely to focus on
external responsibility than girls whose mothers use induction
techniques.


TABLE 7
COMPARISON OF SCORES ON CATEGORIES OF MORAL RESPONSES OF CHILDREN WHOSE MOTHERS USE
PREDOMINANTLY EACH OF THE TWO TYPES OF MATERNAL DISCIPLINE
(Induction vs. Sensitization)
(I = Induction; S = Sensitization)

                                 Middle class  Working class     Boys        Girls     Total sample
Category of moral responses        I     S       I     S       I     S     I     S       I     S

Reparation
    High                          30     8      14    16      19    13    25    11      44    24
    Low                           11    12       9    20      11    18     9    14      20    32
Degree of activity in self-correction
    High                          27     7      17    13      21    11    23     9      44    20
    Low                           14    13       6    23       9    20    11    16      20    36
Unpleasant fortuitous consequences
    High                          10    10      12    29      11    20    11    19      22    39
    Low                           31    10      11     7      19    11    23     6      42    17

[The p column for each group is illegible in the source.]

Note.—Only those categories of moral responses are shown for which the association
with maternal discipline was statistically significant when taken over the total
sample (see Results and Discussion).
a p values for chi square tests (employing correction for continuity).
* p < .05, one-tailed test.
** p < .01, one-tailed test.


The association of the children’s moral responses with maternal
discipline, while less striking and extensive than their association
with social position, is generally consistent and in the expected
direction. The response of reparation and a high degree of
self-initiated acceptance of responsibility are more characteristic
of children whose mothers use induction techniques. Children whose
mothers use sensitization techniques are more likely to perceive
external consequences of transgression in the form of unpleasant
fortuitous events.

The patterning of moral orientation with respect to maternal
discipline tends to be consistent across the various social class and
sex subgroups, thus indicating that it cannot be regarded as merely
coincidental to the relationships between social position and moral
response. Actually, neither the use of reparation nor the degree of
activity in correction of deviance is directly related to the social
class or sex of the child, despite the significant association between
social class and discipline.

These findings, when considered together with the fact that many of
the aspects of moral response related to social position do not vary
with maternal discipline, may indicate that the cataloguing of
specific methods of punishment captures only a very partial segment of
the total pattern of social reinforcement bearing upon the child’s
moral orientation. The crucial elements of the pattern may conceivably
lie, for example, in the nature of certain response cues in the
behavior of either the child or the socializing agent, or in the place
of these cues in the sequence of onset and termination of punishment.
An understanding of how these elements of social reinforcement affect
the child’s moral responses may require discriminations that are
beyond the parent’s capacity as a retrospective observer and demand
the more exact control that is possible with experimental methods.


SUMMARY AND CONCLUSIONS


Contemporary treatments of moral development have characteristically
given a central role to moral judgment and have taken the view that a
moral response is one that is based on the individual’s evaluation of
behavior in terms of his own explicit standards and that no longer
requires the support of external norms and sanctions. A variety of
research and observation, however, suggests that responses to
transgression, even when internalized, often reveal an absence of
self-evaluation and an external orientation
inconsistent with what one might expect if they 
were entirely internally mediated. The purpose 
of the investigation reported in this paper is 
to survey the nature and variety of children’s 
moral responses to transgression, particularly 
with respect to their internal or external orien- 
tation, and to examine their relationship to the 
child’s positions in society. 

A theoretical framework emphasizing differ- 
ential patterns of social reinforcement was 
used to develop certain parallel implications of 
socioeconomic status and sex role for the 
child’s moral orientation. Higher status posi- 
tions, as located in the middle class and the 
masculine sex role, were viewed as providing 
greater reinforcement of responses charac- 
terized by control over one’s own actions as 
well as over the external environment. Lower 
status positions, as located in the working 
class and the feminine sex role, were considered 
more likely to foster responses characterized 
by the perception of one’s actions, and their 
reinforcements, as externally determined. 
These variations in response tendencies were 
expected to have significance for the child’s 
reactions to transgression. Parental discipline 
was singled out as one of the means by which 
various moral responses might be inculcated 
in the child. A distinction was made between 
two major types of discipline. Induction tech- 
niques subsumed methods that should tend 
to induce reactions to transgression that could 
readily become independent of their original 
external stimulus sources. Sensitization tech- 
niques were those that should merely sensitize 
the child to the painful external consequences 
of transgression. 

In order to circumscribe a wide variety of 
moral responses, a highly structured projective 
story completion device was constructed and 
administered to 122 sixth grade children 
equally representing the two major socio- 
economic classes and both sexes. In each 
story, the central figure committed, with 
minimal or no justification, a socially pro- 
hibited act of aggression. The responses to 
transgression occurring in the story comple- 
tions, which could be classified with high 
reliability, were assumed to reflect the sub- 
jects’ own strongest response tendencies. The 
methods of discipline applied to the children, 
as reported in questionnaire interviews with 
their mothers, were also reliably classified, 
and the mothers divided into those whose 
disciplinary techniques were predominantly 
of the induction or of the sensitization type. 

Self-evaluation was found to play only a 
restricted role in internalized responses to 
transgression; many forms of response do not 
seem to require the use of cognitive resources 
of moral judgment. The most common kind 
of moral response, correction of deviance, 
occurred very often without any evidence of 
self-criticism. 

Various forms of external resolution, in 
which the consequences of transgression were 
perceived in discovery and punishment by 
other people or in unpleasant fortuitous events, 
or in which others were held responsible for 
the transgression or for similar actions, were 
also a pervasive aspect of moral responses. 
The frequent coincidence of such externally 
oriented responses with self-criticism did not 
support the view, expressed in some conceptu- 
alizations, that they are merely “defenses 
against guilt” whose function is to avoid self- 
evaluation. The importance of external cues 
and resources in determining responses to 
transgression was again confirmed by the 
many instances in which moral actions were 
initiated upon the influence or demands of 
others, publicly displayed, or carried out with 
the assistance of external agents. 

The limited usefulness of self-evaluation in 
accounting for internalized moral responses, 
and the extensive evidence of their frequent 
external orientation, suggested a theoretical 
perspective in which self-criticism was not 
viewed as a prerequisite of their occurrence, 
but was regarded instead as only one of a 
variety of responses that might be collectively 
referred to as moral consequences in order to 
emphasize their functional equivalence in 
reducing the negative affect invariably asso- 
ciated with transgression. 

The two dimensions of status represented 
in social class and sex role distinctions had the 
expected parallel significance for the child’s 
moral orientation. The middle-class children 
and the boys showed more evidence of in- 
dependence of external events in responding to 
transgression, while the working-class children 
and the girls revealed a greater degree of 
external orientation. Social class differences 
were generally consistent across both sexes 
and sex differences were consistent at both 
social class levels. None of the differences was 
attributable to the variable of intelligence. 

Obvious differences in the behavioral im- 
plications of social class and sex role, despite 
similarities along the status dimension, were 
emphasized by the fact that the parallel varia- 
tions in moral orientation were not identical 
in the specific forms which they assumed. 
The marked social patterning of the moral 
responses was interpreted as supporting the 
view that their different forms should be 
treated as distinct moral phenomena and not 
as equivalent reflections of an underlying 
unitary phenomenon such as “conscience.” 
The relationships between moral responses 
and social positions also suggested that dif- 
ferent moral orientations do not emerge se- 
quentially with advancing age or experience, 
as has been argued in some interpretations 
of moral development, but are rather the 
stable end-results of different patterns of 
social reinforcement. 

The associations of moral responses with 
maternal discipline, while in the expected 
direction, were less extensive than the associa- 
tions with social position, and to the extent 
that retrospective parental report can be 
relied on, it appeared that disciplinary tech- 
niques, in themselves, provided only a partial 
view of the mechanisms through which differ- 
ences in moral orientation were generated. 
Specific methods of punishment may provide 
only a gross picture of the total pattern of 
social reinforcement affecting the child’s moral 
orientation, and experimental methods may 
be necessary in order to make more refined 
discriminations. 
FIELD DEPENDENCE AND INTELLECTUAL FUNCTIONING¹


DONALD R. GOODENOUGH AND STEPHEN A. KARP


State University of New York, Downstate Medical Center


In their early work on individual differences in perception, Witkin,
Lewis, Hertzman, Machover, Meissner, and Wapner (1954) found a number
of personality correlates of field dependence. Most of their results
have since been confirmed (e.g., Gruen, 1955; Gardner, Holzman, Klein,
Linton, & Spence, 1959). In addition, studies have since been reported
exploring relationships between field dependence and many other
aspects of individual functioning. Of particular concern here are
those studies bearing on intellectual functioning.

The evidence now available suggests that field dependent subjects tend
to perform less effectively on standard tests of intelligence than
field independent subjects. Woerner and Levine (1950 unpublished)
found significant relationships between field dependence measures and
IQ scores on the Wechsler Intelligence Scale for Children (WISC) using
12-year-old subjects. Similar relationships have since been reported
by several other investigators using a variety of intelligence tests
and subject populations.

The present study represents an attempt to examine the nature of this
relationship through factor analyses of intercorrelations among tests
of intelligence and field dependence.

The dimension of field dependence has been defined by Witkin and his
colleagues (1954) in terms of the capacity to overcome embedding
contexts in perception. Subjects who easily “break up” an organized
perceptual field—who can readily separate an item from its
context—are called field independent; subjects who readily accept the
prevailing field or context—who have difficulty in separating an item
from its context—are called field dependent.

¹ This study was supported in part by a research grant (M-628) from
the National Institutes of Health, United States Public Health
Service. H. A. Witkin is the principal investigator for this grant.
Equipment used in the studies to be reported was made available by the
Office of Naval Research, equipment-loan contract Nonr-1175 (00).
The authors are deeply indebted to B. Goodman, C. Johnson, S. Nathan,
F. Plumeau, and P. Plumeau for their assistance in carrying out these
studies.

Commenting on the findings of Woerner and Levine (1950 unpublished),
Witkin et al. suggest that some intellectual tasks may also require
the capacity for overcoming embedding contexts. They suggest that the
relationship between IQ scores and field dependence measures may be a
function of this common requirement.

On the basis of this interpretation, we would expect that some WISC
subtests reflect the capacity for overcoming embeddedness. Such
subtests should be highly related and factorially similar to measures
of field dependence.

It may be noted that the various Wechsler scales (Bellevue, WAIS, and
WISC) have been studied extensively by factor analysis (see Cohen,
1957, 1959, for example). Most of these analyses are in agreement in
identifying three major factors.

The first of these factors, most often defined by the Information,
Comprehension, Similarities, and Vocabulary subtests, has usually been
labeled “Verbal Comprehension.”

A second factor, frequently defined by the Arithmetic and Digit Span
subtests, has been called “Memory,” “Freedom from Distractibility,”
“Attention-concentration,” or “Concentration-speed.”

A third factor, variously labeled “Closure,” “Performance,”
“Spatial-perceptual,” “Nonverbal Organization,” “Visualization,” and
“Perceptual Speed” has usually been defined by the Block Design and
Object Assembly subtests, with Picture Completion also included in
some studies. This third factor is particularly relevant to the Witkin
et al. (1954) hypothesis.

The Block Design, Object Assembly, and Picture Completion subtests,
which define the Closure factor, appear to involve the capacity to
overcome embeddedness. In the Block Design subtest the subject is
shown a reference design that he must copy by the appropriate
arrangement of blocks. The reference design forms an organized whole
that must be “broken up” for effective performance. Object Assembly
performance involves putting together a jigsaw-like series of pieces to form a 
meaningful whole. For some items the subject 
is told what the whole object is and for others, 
identification of the object is fairly obvious. 
Too rigid adherence to a conceptualized image 
of the solution may impair performance on this 
task, particularly where the parts given the 
subject do not readily correspond to units mak- 
ing up his image of the whole. In the Picture 
Completion subtest the subject is shown a pic- 
ture that has some part missing. The whole 
must be analyzed into its components in order 
to locate the missing part. 

If this analysis of the subtests defining the 
Closure factor is correct, and if the relation- 
ship between field dependence and IQ is a 
function of a common cognitive style, i.e., the 
capacity to overcome embeddedness, it should 
be possible to demonstrate that tests of field 
dependence and Wechsler Closure subtests 
define a single, common factor. 


METHOD 


The study reported here is part of a larger investi- 
gation (Witkin, Dyk, Faterson, Goodenough, & Karp, 
in press) in which individual differences in psychological 
differentiation are being explored. As part of 
that investigation, a number of cognitive tests, in- 
cluding the WISC and standard tests of field depend- 
ence, were administered to two groups of children. 
Group A consisted of 25 boys and 25 girls between 11.5 
and 12.5 years of age. Group B included 30 boys be- 
tween 9.5 and 10.5 years. Subjects of both groups were 
volunteers, with parental consent, drawn from a public 
school in Brooklyn, New York. 

Measures of field dependence were obtained from 
three perceptual situations, the Embedded Figures 
Test, the Rod-and-Frame Test, and the Tilting-Room- 
Tilting-Chair Test (Witkin et al., 1954). 

In the Embedded Figures Test (EFT) the subject 
is asked to locate a simple figure embedded in a complex 
geometric form. His score is the mean time taken to 
locate the simple figures for 24 test items. 

In the Rod-and-Frame Test (RFT) the subject is 
seated in a lightproof room, facing a luminous rod 
surrounded by a luminous frame. With frame and rod 
tilted, he is asked to adjust the rod to the objective 
upright. The RFT is divided into three series: where 
the subject’s chair is tilted opposite to the tilt of the 
frame; where the subject’s chair is tilted to the same 
side as the frame; where the subject is seated erect. 
The score for each series is the mean number of degrees 
of deviation of the rod from the upright when the 
subject sees it is straight. 

In the tilting-room-tilting-chair situation, the 
subject is seated in a chair within a specially con- 
structed room. The room and chair can be tilted in- 
dependently to the right or left. At the beginning of 
each of the 14 trials, the room and chair are tilted. For 
the first 8 trials, called the Room Adjustment Test 
(RAT), the subject is asked to adjust the tilted room 
to the upright, while the chair remains tilted. The 
remaining trials, called the Body Adjustment Test 
(BAT), involve adjustment of the chair (and thus the 
subject’s body) to the upright while the room remains 
tilted. Both RAT and BAT are divided into two series, 
Series 1 of each test including trials where room and 
chair are initially tilted to opposite sides and Series 2 
where they are tilted to the same side. For each series 
of each test the subject’s score is the mean deviation, 
in degrees, of his estimates from the true upright. 

The EFT, RFT, and BAT each require the subject 
to separate an item (whether a simple figure, a rod, or 
the body) from an embedding field and have been used 
to define perceptual field dependence. Witkin et al. 
(1954) call these “part-of-a-field” tests. In contrast, 
the RAT has been called a “field-as-a-whole” test and 
apparently does not directly involve the capacity to 
overcome an embedding context. 

In all, the correlation matrix for Group A involved 
21 variables: 12 WISC subtests, 3 series of the RFT, 2 
series each for the BAT and RAT, EFT, and Sex. 

Each of the 10-year-old Group B boys received the 
WISC and field dependence tests just described. 
However, Series 1 and 2 of the RFT were combined 
with equal weight and treated as a single variable. BAT 
and RAT series were similarly combined, yielding one 
score for each test. These changes were made in the 
interest of economy, after inspection of the Group A 
results suggested similar patterns of correlation for 
the several series of each test. 
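
The scoring rules just described can be summarized in a brief sketch. It assumes that a series score is the mean absolute deviation from the upright and that "equal weight" means a simple average of the two series scores; neither assumption is stated explicitly above, and the trial values and function names are hypothetical.

```python
from statistics import mean


def series_score(settings_deg, upright_deg=0.0):
    """Mean deviation, in degrees, of the subject's settings from the upright.

    Assumption: deviations are taken as absolute values, so that errors to the
    left and to the right do not cancel.
    """
    return mean(abs(setting - upright_deg) for setting in settings_deg)


def combine_equal_weight(score_1, score_2):
    """Combine two series scores with equal weight (assumed: a simple average)."""
    return 0.5 * score_1 + 0.5 * score_2


# Hypothetical rod settings (degrees from true upright) for two RFT series.
series_1 = [12.0, -8.0, 10.0, -6.0]
series_2 = [4.0, -5.0, 3.0, -4.0]

score_1 = series_score(series_1)
score_2 = series_score(series_2)
combined = combine_equal_weight(score_1, score_2)
print(score_1, score_2, combined)
```

Higher scores in this sketch indicate larger errors in the adjustment of the rod and hence greater field dependence.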

In addition to these variables, data from a number 
of other cognitive tests were available for almost all of 
the Group B subjects. These tests had been adminis- 
tered as part of the larger study from which all of the 
data reported here were taken. The additional cognitive 
variables seemed sufficiently interesting and relevant 
to warrant their inclusion in the factor analysis despite 
the problem created by the loss of one or two subjects 
on each test. 

1. The Children’s Embedded Figures Test, devel- 
oped by Goodenough and Eagle (in press), is a modi- 
fied version of the EFT designed for use with chil- 
dren. As in the EFT, the subject is required to locate 
a simple figure embedded in a complex one; however, 
the complex figures are meaningful objects from which 
the simple figures may be removed. The score is the 
number of simple figures correctly identified. 

2. Recognition-efficiency test: nonserial version 
involves exposure of a series of pictures at an out-of- 
focus lens setting, subjects being asked to identify each 
picture. The test is scored for the number of pictures 
correctly identified after a single presentation. 

3. Recognition-efficiency test: serial version, devel- 
oped by Gump (1955), is similar to the nonserial 
version; however, each picture is presented in increasingly 
sharp focus until identified correctly. High 
scores reflect early (relatively out-of-focus) recognition.* 

4. In the Hidden Pictures Test (Thurstone, 1944) 
the subject is asked to discover familiar objects and 
faces hidden in a complex scene. 

* We are indebted to P. V. Gump for the loan of 
his recognition-efficiency test material. 

5. The Incidental Learning Test involves a modifi- 
cation of a technique used by Gardner et al. (1959). The 
subject is briefly shown a series of eight common three- 
letter words, each of which is printed in one of four 
colors. During the presentation, he is asked to name 
the color of each word. Following a single exposure of 
the series, the subject is tested for recall of the words, 
his score being the number of words recalled. 

6. The Intentional Learning Test uses the same type 
of materials and follows the same procedure as inci- 
dental learning, except that, prior to exposure of the 
list of words, the subject is instructed to name and try 
to remember the words (ignoring the colors). Again, 
the subject is tested for recall of words after one pres- 
entation of the stimulus series, his score being the 
number of words recalled. 

7. Reconciliation of Opposites was patterned after 
the Stanford-Binet subtest of that name with several 
easy items added to make it appropriate for 10-year- 
olds. The subject is read pairs of opposites (e.g., black- 
white) and asked to tell how they are the same or 
similar. Scoring is consistent with criteria used by 
Terman and Merrill (1937). 

8. The Cancellation Test, patterned after a satiation 
situation, uses a series of four pages filled with lines 
of random letters. The subject is instructed to cross 
out all the “t’s” on the page as quickly as he can. 
Reductions in time and errors from the first to fourth 
pages are scored and combined with equal weight to 
provide an improvement score. 

The correlation matrix for the Group B subjects was 
based on 25 variables. 

Two centroid factor analyses (one for each group) 
were carried out. “Blind” rotation to oblique simple 
structure followed the graphic method outlined by 
Cattell (1952). The decisions to terminate factor 
extraction were based on Saunders’ test. 


RESULTS AND DISCUSSION 


Eight factors were extracted from the Group 
A correlation matrix and nine factors from the 
Group B matrix. Despite the small N, the 
hyperplanes seem reasonably well-defined, over 

% of the factor loadings in each study falling 
between ±.10. Rotated loadings for Groups 
A and B are presented in Tables 1 and 2, 
respectively.* 
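
The hyperplane criterion mentioned above, the proportion of rotated loadings falling between ±.10, can be sketched as follows. The loading matrix is hypothetical and is not taken from Tables 1 or 2; treating the bounds as inclusive is likewise an assumption.

```python
def hyperplane_proportion(loadings, width=0.10):
    """Proportion of loadings whose absolute value does not exceed `width`.

    `loadings` is a list of rows (variables) by columns (factors). The bounds
    are treated as inclusive, which is an assumption.
    """
    values = [value for row in loadings for value in row]
    in_hyperplane = sum(1 for value in values if abs(value) <= width)
    return in_hyperplane / len(values)


# Hypothetical rotated loadings for five variables on three factors.
example_loadings = [
    [0.62, 0.05, -0.03],
    [0.58, -0.08, 0.11],
    [0.04, 0.71, 0.02],
    [-0.09, 0.66, 0.07],
    [0.01, 0.03, 0.54],
]

proportion = hyperplane_proportion(example_loadings)
print(f"{proportion:.2f} of the loadings fall between -.10 and +.10")
```

A larger proportion indicates that each factor leaves most variables with near-zero loadings, which is the sense in which the hyperplanes are said to be well defined.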

The first three factors in each study are 
closely matched and apparently correspond to 
the three major factors repeatedly isolated in 
previous studies of the Wechsler scales. 

* Six tables, including zero-order intercorrelation 
matrices, unrotated factor loadings, and intercorrelations 
among factors for Groups A and B, have been 
deposited with the American Documentation Institute. 
Order Document No. 6864 from ADI Auxiliary 
Publications Project, Photoduplication Service, Library 
of Congress, Washington 25, D. C., remitting in advance 
$1.25 for microfilm or $1.25 for photocopies. 
Make checks payable to: Chief, Photoduplication 
Service, Library of Congress. 


Factor 1 


The first factor, on which the Vocabulary, 
Information, Similarities, Arithmetic, and 
Comprehension subtests of the WISC are loaded 
in both studies, is apparently the verbal com- 
prehension factor isolated in previous studies 
of Wechsler scales. Although the loading of 
Arithmetic on this factor is not frequently re- 
ported, Cohen (1959) obtained this result with 
10.5-year-olds. 

Among the variables included only in the 
Group B study, Reconciliation of Opposites 
and Intentional Learning were loaded on Fac- 
tor 1, a result consistent with the verbal na- 
ture of these tests. Sex is loaded on this factor 
for Group A, with boys displaying greater 
verbal comprehension than girls. This result is 
not consistent with the literature on sex 
differences in verbal comprehension. 


Factor 2 

This factor, loaded in both studies by Digit 
Span, Arithmetic, and Coding subtests of the 
WISC, is apparently the attention-concentra- 
tion or memory factor previously isolated in a 
number of studies. The loading of Coding on 
this factor, though infrequent in earlier studies, 
appears reasonable. Similarly, the loading of 
the Children’s EFT in the Group B study 
seems reasonable in view of the apparent atten- 
tion requirement of this test. It is surprising, 
however, that the EFT is not loaded in either 
study. 

In addition to the Children’s EFT, four 
other variables included only in the Group B 
study appear to be loaded on Factor 2. Inten- 
tional Learning is loaded positively and non- 
serial Recognition-efficiency, Cancellation, and 
Sex negatively (girls attending more success- 
fully than boys). The latter result appears con- 
sistent with the literature on sex differences in 
attention. It also seems reasonable to expect 
that ability to concentrate would aid perform- 
ance in the Intentional Learning situation. 


Factor 3 


The three tests of perceptual field depend- 
ence, RFT, EFT, and BAT, have their highest 
loadings on this factor as do two other per- 
ceptual tests, the Children’s EFT and Hidden 
Pictures, which are very similar to the EFT. 
Three WISC subtests, Block Design, Picture 
Completion, and Object Assembly, are also 
highly loaded on Factor 3. The latter tests 
have often defined a closure factor in earlier 
studies. As suggested above, all of these tests 
may have, as a common requirement, the over- 
coming of an embedding context. 


TABLE 1 
GROUP A FACTOR MATRIX AFTER FINAL ROTATION 

                                    Factors 
Test                    1    2    3    4    5    6    7    8 
RAT: Series 1a        -06  -08   06   58  -04   34   11   09 
RAT: Series 1b        -05   10  -03   67   11   41   19  -10 
BAT: Series 2a        -41  -04   39   02  -10   60   09   28 
BAT: Series 2b         09  -06   44   08   08   52  -07   05 
RFT: Series 1          00  -05   68   02   09  -03  -11  -10 
RFT: Series 2         -06   00   50   06   34  -26   09  -04 
RFT: Series 3          11  -30   68    ?   02   00   06   26 
EFT                   -07   10   50  -17   10  -03  -06   49 
Information            65   04  -08   05  -03  -05   06   21 
Comprehension          38   29  -24  -16   21  -03  -03  -10 
Arithmetic             39   41   03   09  -10  -10  -09   14 
Similarities           44   08   03  -44   09   08   08   09 
Vocabulary             63  -03  -06  -09  -07   10   10  -23 
Digit Span            -06   62   01  -02   01   04   09  -18 
Picture Completion     35  -39    ?  -10   31   05   65  -05 
Picture Arrangement    36  -17   09  -07   58   10   00    ? 
Block Design           07   00   42   01  -09   05   04   54 
Object Assembly       -07  -03   57   00   61   02  -05   05 
Coding                 04   39    ?    ?    ?    ?    ?    ? 
Mazes                  07   24   57   03  -03   09   05  -06 
Sex                    42  -44   05   09  -04   04  -08   04 

Note. In Tables 1 and 2 decimals are omitted; loadings over ±.20 are in boldface type; and tests are reflected where neces- 
sary so that positive scores represent greater field independence, greater intelligence, better memory, more efficient recognition, or 
masculinity. 


TABLE 2 
GROUP B FACTOR MATRIX AFTER FINAL ROTATION 

                                                      Factors 
Test                                     1    2    3    4    5    6    7    8    9 
RAT                                    -59   09    ?  -05   10   20  -33   10    ? 
BAT                                     09   09   43   10  -07   33    ?  -02  -08 
RFT: Subject tilted (Series 1 and 2)   -08  -10   74  -06   34   01   02   10  -02 
RFT: Subject erect (Series 3)          -05   03   69   34   10   23   03  -09  -25 
EFT                                    -03   09   69   03  -03  -24   03  -07    ? 
Information                             67   10   07  -10   01  -19   10   04    ? 
Comprehension                           46   01    ?  -09   04   02  -01   01   61 
Arithmetic                              53   46   06   00  -30  -03   05  -03    ? 
Similarities                            57   05   06   39   01   10  -41  -06  -05 
Vocabulary                               ?  -05  -05   06   06  -10  -10   18    ? 
Digit Span                              25   63  -06  -02   01   06  -07   00    ? 
Picture Completion                     -07  -07   52  -09  -46   08   10   05  -04 
Picture Arrangement                    -01  -10   11   50  -03  -10  -05  -01   10 
Block Design                            09   42   60  -10  -01  -05  -09   05  -08 
Object Assembly                         06   09   33   52  -38   08   07   08    ? 
Coding                                 -06   43  -10   04  -25   00  -24  -02    ? 
Mazes                                  -06   10  -10  -09  -08    ?  -02   45  -05 
Recognition-efficiency: serial          24   03  -40  -03  -30   20  -08   03   06 
Recognition-efficiency: nonserial       06  -52   07   07  -08   02   00   23   05 
Cancellation                           -07  -38  -04  -08  -09  -88  -10  -09   90 
Incidental Learning                     10  -08   29  -25   09   02  -03   22   19 
Intentional Learning                    37   31   10  -18   10  -26  -23  -07  -22 
Reconciliation of Opposites             64  -02   04   29   08   03   14  -08   09 
Children’s EFT                         -04   29   61   01  -04   00    ?  -17    ? 
Hidden Pictures                         10  -01   27   02  -10  -10  -87  -21    ? 

The absence of RAT from Factor 3 in one 
analysis and its low loading in the other tends 
to confirm Witkin’s distinction between “‘field- 
as-a-whole” and “part-of-a-field” tests. Ap- 
parently the RAT is different from the other 
perceptual measures, although relationships 
between this test and field dependence meas- 
ures have been obtained in many studies. 

Several other tests appear loaded on Factor 
3 in one or both analyses. These include Com- 
prehension (loaded negatively in both analy- 
ses), serial Recognition-efficiency and Inci- 
dental Learning (given only to Group B sub- 
jects), and Mazes (loaded for Group A but 
not B). The bases for these loadings are not 
clear at present. 

The remaining factors are not easily matched 
for the two groups. Nor can they be readily 
identified with factors obtained in earlier 
studies. Some of these factors may be spurious, 
particularly in the Group B study where data 
were missing for some of the tests. The possi- 
bility that failure to match may be a function 
of differences in the variables included in the 
two studies or a function of developmental 
differences should also be considered. In any 
event, after the first three factors, the results 
are not clearly interpretable and do not seem 
immediately relevant to the purposes of the 
present study. 

The hypothesis that there is a factor, com- 
mon to intellectual and perceptual tests and 
involving the capacity to overcome an embed- 
ding context, receives some support from the 
results of these studies. Each of the standard 
tests of perceptual field dependence is highly 
loaded on Factor 3. The three WISC subtests 
which have defined a closure factor in previous 
work also appear on Factor 3. Furthermore, 
they are the only WISC subtests substan- 
tially loaded on Factor 3 in both studies. Nor 
are the tests of field dependence consistently 
or substantially loaded on the verbal compre- 
hension or attention-concentration factors. 
These results tend to support the Witkin hy- 
pothesis that relationships obtained in many 
studies between tests of field dependence and 
standard tests of intelligence stem, at least in 
part, from common requirements shared by 
measures of field dependence and of certain 
kinds of intellectual abilities. 

These results do not exclude the possibility 
that some intellectual tasks may be related to 
measures of field dependence for other reasons. 
For example, some of the personality corre- 
lates of field dependence may influence per- 
formance on particular types of intellectual 
problems. In fact, the high relationships be- 
tween the Incidental Learning Test and meas- 
ures of field dependence might be interpreted 
on this basis. A detailed discussion of such pos- 
sibilities is presented elsewhere (Witkin et al., 
in press). 


SUMMARY 


The present study was designed to test the 
hypotheses that some intellectual and per- 
ceptual tests have a common requirement for 
overcoming embedding contexts, and that re- 
lationships obtained between measures of field 
dependence and standard tests of intelligence 
are based on this common factor. Two factor 
analyses were conducted on matrices of correla- 
tions between cognitive tests, including tests of 
field dependence and the subtests of the WISC. 
The results tend to support both hypotheses. 




SOME EFFECTS OF PERCEIVED STATUS ON RESPONSES TO 
INNOVATIVE BEHAVIOR¹ 


E. P. HOLLANDER² 


School of International Service, American University 


WHAT one member of a group may do 
with impunity another may not do. 
This observation fits many circum- 
stances of social interaction and serves as a 
central element in the “idiosyncrasy credit” 
model of perceived status and conformity 
(Hollander, 1958). In terms of the model, evi- 
dences that a person is competent in some focal 
group activity, and that he has conformed to 
applicable group expectancies, result in his 
accumulating credits that he may draw on later 
for innovative behaviors directed at exercising 
influence. Moreover, such accumulations lead 
in time to alterations in expectancies that 
make certain deviations acceptable as a fea- 
ture of higher status. Recently, these relation- 
ships have been studied in an experiment with 
task oriented groups (Hollander, 1960). 

Another illustration of the operational as- 
pects of perceived status is provided by an 
experiment reported in Pepitone (1958, p. 266) 
where subjects were presented with a script of 
a technical discussion and were given varying 
prior sets as to the level of expert qualification 
of one of the two participants; the greater the 
alleged expertness of that person, the more 
favorable were the subjects’ interpretations of 
his negative acts as well as of his positive ones. 

The present experiment follows a simple, 
two-step approach in further studying differ- 
ences in response to behavior as a function of 
the perceived status of the behaving person. 
First, it may be predicted that attributing 
higher degrees of competence to a person 
should result in correspondingly greater will- 
ingness to accept that person’s authority. 
Further, this effect should be enhanced for 
each degree of ascribed competence the longer 
the person is known to have belonged to the 
group; all things constant, the equally com- 
petent newcomer to the group should have 
less authority than his counterpart who has 
been there longer. Second, the status thus 
accorded the person described should be in- 
versely related to disapproval of his deviancy, 
i.e., the higher the status, the less the disap- 
proval. This relation should hold especially in 
the case of innovative deviancy. In short, the 
intent is to induce levels of perceived status 
so as to discern their effect upon responses to 
given behaviors. To produce status experi- 
mentally, the first step of this approach makes 
use of the direct “trait description” technique 
employed by Asch (1946), in his study of im- 
pressions of personality, and more recently by 
Bruner, Shapiro, and Tagiuri (1958). 

¹ The research upon which this paper is based was 
completed by the author as principal investigator of 
ONR Contract 816(12) with Washington University. 
The opinions expressed are not to be construed as 
necessarily reflecting those of the Department of the 
Navy. 

² It is a pleasure to acknowledge the valuable aid 
of Robert M. Taylor in accomplishing the work asso- 
ciated with this study. 


METHOD 
Subjects, Setting, and Set 


One hundred and fifty-one undergraduate students 
enrolled in lower-level psychology courses at Washing- 
ton University were utilized as subjects. Sixty-four 
were males, and 87 females. All fell within the age 
range 18-23, with a mean about 20. The entire proce- 
dure was conducted in class sections, after an intro- 
ductory statement by the investigator that this was a 
study of attitudes in groups and that responses would 
be anonymous. Forms with precise instructions were 
then distributed. 


Procedure 


The first form instructed subjects to think of a 
group to which they belonged, at the time or before, 
and to imagine in it a person of their own sex who 
was described by a succinct set of terms which fol- 
lowed. Then the form asked: “Knowing this informa- 
tion, how willing would you be to have this person in a 
position of authority in the group?” Responses were 
made on a seven-point scale labeled “definitely not” 
at one end, “very willing” at the other, and “neutral” 
in the middle. 

Each subject received a single description made up 
of four terms, of which two were varied, and two held 
constant, throughout. Competence in the group’s 
activity, arrayed at four degrees, was paired with one 
of two levels of time in group to provide eight descrip- 
tions of the stimulus person within a 2 X 4 design: 
thus either “been in group for some while” or “new to 
group” appeared in combination with “extremely 
capable performer in group’s activity,” or “capable 
performer in group’s activity,” or “average performer 
in group’s activity,” or “poor performer in group’s 
activity.” The constant terms, “interested” and 
“generally liked,” were used to control for two other 
determinants of status postulated in the model, 
“motivation to belong” and “beta value.”³ Examples 
of the resulting descriptions are given by the two ex- 
tremes below: 


Extremely capable performer in group’s activity. 
Been in group for some while. Interested. 
Generally liked. 

Poor performer in group’s activity. 
New to group. Interested. 
Generally liked. 
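The 2 × 4 arrangement described above can be sketched by crossing the two varied terms and appending the two constant ones. The wording of each term is taken from the text; the variable names and the order in which the eight descriptions are generated are illustrative assumptions, not the order used on the original forms.

```python
from itertools import product

# Varied terms (four degrees of competence, two levels of time in group).
COMPETENCE = [
    "Extremely capable performer in group's activity",
    "Capable performer in group's activity",
    "Average performer in group's activity",
    "Poor performer in group's activity",
]
TIME_IN_GROUP = ["Been in group for some while", "New to group"]

# Terms held constant in every description.
CONSTANT_TERMS = ["Interested", "Generally liked"]


def build_descriptions():
    """Cross time in group with competence to get the eight stimulus descriptions."""
    return [
        ". ".join([competence, time, *CONSTANT_TERMS]) + "."
        for time, competence in product(TIME_IN_GROUP, COMPETENCE)
    ]


descriptions = build_descriptions()
for text in descriptions:
    print(text)
```

Each subject would receive exactly one of these eight strings, which is what makes the design a between-subjects comparison.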


Having made their ratings, subjects were then given 
a second form instructing them to think again of the 
same group with the person of their own sex once again 
described exactly as before. Eight behaviors were pre- 
sented to afford a range of stimuli for responses linked 
to status. The first of these, “suggests changes from 
group plans,” was of special significance in terms of 
the influence aspect of status. The others, in order, were: 
“speaks freely about other group members,” “questions 
the views of other group members,” “performs tasks 
independently,” “interrupts to express comments at 
group meeting,” “discusses group concerns with out- 
siders,” “makes own decisions on attending group 
functions,” and “casual in relations with other group 
members.” All were selected as illustrative of intra- 
group behaviors that might be variously interpreted 
and were based on responses to an open-ended question 
from a similar but independent group of subjects. In 
this second step, subjects indicated for each of these 
eight behaviors whether their evaluation of the stim- 
ulus person would go up or go down, and by what per- 
centage weight, if he displayed it. 

Scores for accorded status were readily derived 
from the rating responses made to the question con- 
cerning willingness to have the stimulus person in a 
position of authority. These were scaled from 7 for 
“very willing” to 1 for “definitely not” with 4 assigned 
to “neutral.” The distribution of these scores was in- 
clined toward the upper values with a mean of 4.5, a 
median and mode of 5, and a standard deviation of 1.6. 


RESULTS AND DISCUSSION 


Table 1 presents the means for the accorded 
status score by experimental treatments, the 
Ns for which varied from 17 to 22. In both 
columns the values proceed steadily upward in 
line with the rising degrees of competence 
ascribed to the stimulus person. At each de- 
gree, those said to be in the group longer receive 
a higher mean score on this status measure. An 
analysis of variance completed for these data 
yielded significant F values for both major 
variables, but not for sex difference or any of 
the interaction terms. 

³ See Hollander (1958) for a further explication of 
these variables, and Hollander (1961) for a detailing 
of their background in previous research. 


TABLE 1 
MEANS FOR ACCORDED STATUS BY EXPERIMENTAL TREATMENTS 

Treatment                      In group for some while   New to group 
Extremely capable performer             6.25                 5.84 
Capable performer                       6.11                 5.50 
Average performer                       5.06                 4.50 
Poor performer                          2.95                 2.53 
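The two descriptive patterns in Table 1 (means rising with competence, and the longer-tenured person scoring higher at every degree) can be checked directly from the eight reported cell means. This is a sketch of that check only; it is not the analysis of variance, which would need the individual scores and cell Ns that the report does not give. The helper names are illustrative.

```python
# Cell means from Table 1, listed from highest to lowest ascribed competence.
LEVELS = ["Extremely capable", "Capable", "Average", "Poor"]
MEANS = {
    "in group for some while": [6.25, 6.11, 5.06, 2.95],
    "new to group": [5.84, 5.50, 4.50, 2.53],
}


def falls_with_lower_competence(column):
    """True if each mean is larger than the mean for the next lower competence level."""
    return all(higher > lower for higher, lower in zip(column, column[1:]))


# Difference between the two tenure conditions at each competence level.
tenure_gaps = [
    round(longer - newer, 2)
    for longer, newer in zip(MEANS["in group for some while"], MEANS["new to group"])
]

for level, gap in zip(LEVELS, tenure_gaps):
    print(f"{level:18s} tenure advantage = {gap:.2f}")
print({name: falls_with_lower_competence(col) for name, col in MEANS.items()})
```

The tenure advantage is positive in all four rows and of similar size, which is at least in keeping with the reported absence of a significant interaction.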

Because of marked intersubject variability 
in the “percentage weight” response, the meas- 
ure of disapproval employed for the second 
part of the experiment was the percentage of 
subjects responding “down” for each behavior, 
by the level of status previously accorded the 
stimulus person. These data were viewed in 
two ways. The difference in disapproval for 
each behavior, across status, was tested for sig- 
nificance by chi square, revealing only “sug- 
gests changes” to have a value significant be- 
yond the .05 level. Second, levels of accorded 
status were correlated (rho) with ranks for 
magnitude of disapproval. A high negative 
coefficient signifies a decrease in disapproval as 
status increases, in keeping with prediction. 
This was in fact the case for two behaviors 
with significant coefficients: the key innovative 
behavior “suggests changes from group plans” 
(—.96), and “discusses group concerns with 
outsiders” (—.74). The pattern of rho values 
for the other behaviors was not uniformly 
negative. Of the three with positive signs, only 
one approximated significance, that being 
“interrupts to express comments at group 
meeting” (+.68). 
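The rank correlation used here can be sketched as follows. The percentages below are hypothetical values invented purely for illustration (the report gives only the resulting coefficients, not the underlying percentages), and the use of seven status levels is an assumption based on the seven-point rating scale. The function computes rho as the Pearson correlation of average ranks, which also handles ties.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rank_x) / n, sum(rank_y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    spread_x = sum((a - mean_x) ** 2 for a in rank_x)
    spread_y = sum((b - mean_y) ** 2 for b in rank_y)
    return covariance / (spread_x * spread_y) ** 0.5


# HYPOTHETICAL data: percentage responding "down" at each accorded-status level.
status_levels = [1, 2, 3, 4, 5, 6, 7]
percent_disapproving = [60, 65, 50, 55, 30, 20, 10]

rho = spearman_rho(status_levels, percent_disapproving)
print(f"rho = {rho:.2f}")
```

A coefficient near -1 means disapproval falls almost monotonically as status rises, which is the direction predicted for innovative behavior.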

In Figure 1 curves are plotted for the two 
behaviors with significant values for rho and 
for the one with a value nearly so. Most marked 
is the curve for “suggests changes,” the be- 
havior most closely tied to the assertion of 
influence. Disapproval of this behavior drops 
off quite systematically with rising accorded 
status. A similar though less sharp effect is 
seen for “discusses with outsiders,” a behavior 
connected with the spokesman or advocate 
function; while not approved at any level, at 
the two lowest reaches of accorded status this 
is disapproved by all respondents. 

FIG. 1. Percentage of respondents giving downward evaluation of stimulus person for displaying indicated 
behaviors, by status accorded the stimulus person. (Horizontal axis: accorded status.) 


A contrasting effect is seen in the positive 
curve for “interrupts,” suggesting that this 
behavior is less likely to alter the evaluation of 
a person of lower status than one of higher 
status. This interpretation fits the proposition 
that shifts in expectancies are associated with 
increased status. In this instance, and as part 
of a probable democratic image of leadership, 
it becomes more crucial in conserving credits 
for the high status member than for the low 
status member not to interrupt others in the 
group. 

The findings of this experiment are in line 
with the elements of the model considered. Of 
particular note is the evidence of directly dis- 
cernible inputs to status, from competence and 
time in group, that result in consistent re- 
sponse tendencies. These effects are the more 
noteworthy for having been produced irre- 
spective of the nature of the group—a choice 
left to each subject individually—and with 
only minimal stimulus material.⁴ 

⁴ This consideration encourages the prospect of 
appropriating this technique as a classroom demon- 
stration. 


SUMMARY 


An experiment was conducted to test pos- 
tulates from the “idiosyncrasy credit” model 
of status. A brief description of a person they 
were to imagine in any group to which they be- 
longed was presented to 151 subjects of both 
sexes. Competence and length of time in the 
group were the major attributes manipulated. 
These were paired in eight descriptions, only 
one of which was given to each subject as a 
treatment. Responses were then made on a 
seven-point scale to signify willingness to have 
that person in a position of authority in the 
group, and this served as an index of accorded 
status. A rising mean score for accorded status 
was found for the increasing degrees of com- 
petence, and the mean for “new to group” 
was uniformly lower than that for “in group 
for some while” at each degree. The effects of 
these two major variables were significant, while 
sex difference was not. Subjects also provided 
an evaluation of the same person in terms of 
eight possible ways he might behave in the 
group. According to prediction, two behaviors 
reflecting innovative action were found to be 
disapproved significantly less the higher the 
status attributed to the innovator. 




HYPOTHESES AND HABITS IN VERBAL “OPERANT CONDITIONING” 


DON E. DULANY, Jr. 


University o, 


STUDIES of verbal “operant conditioning” 
(Adams, 1957; Krasner, 1958) raise two 
fundamental questions: Is this learning 
without awareness? And how well do empirical 
operant principles describe the data obtained 
in a verbal conditioning experiment? Possibly 
the effects were under the subject’s own verbal 
control. 

In the studies most faithful to the operant 
conditioning paradigm, subjects said words ad 
libitum and rate of plural nouns increased when 
followed by a buzzer (Greenspoon, 1951), a 
light (Greenspoon, 1951; Sidowski, 1954), or a 
casually murmured “Umhmm” (Greenspoon, 
1955; Mandler & Kaplan, 1956; Wilson & Ver- 
planck, 1956). Greenspoon (1955) also showed 
that Umhmm was an effective reinforcer for 
other words. It is unsurprising to learn that 
one can influence the actions of another co- 
operative human being by informing him 
when his behavior is judged correct or accept- 
able. These studies attract attention because 
they apparently demonstrate learning without 
awareness in some sense. Some see in this a 
program for identifying response classes and 
empirical reinforcers at the human level 
through the use of operant conditioning pro- 
cedures. As Verplanck (1955) puts it, “This is 
the identification of responses and reinforcing 
stimuli, and the verification and education of 
laws relating them to one another in human 
behavior under conditions where the subject is 
acting as naturally as possible, and where inso- 
far as possible, he is not ‘aware’ of what is 
going on” (p. 597). Salzinger (1959), in discuss- 
ing the verbal conditioning literature, con- 
cludes that, “Very little deviation from the 
approach in animal work has been necessary 
so far to obtain lawful data in verbal behavior” 
(p. 70). But those accustomed to wonder about 
reinforcing mechanisms may think these find- 
ings antecedently improbable. Umhmm under 
several circumstances could be a pleasant thing 
to hear, but it is unclear how a buzzer or a flash 
of a 75-watt bulb would activate any of the 
processes variously assumed to be the basis for 
a law of effect that is more than empirical. The 
improbable is not to be dismissed, certainly, 
but it does invite an examination of alterna- 
tive explanations: in this case, the possibility 
that the operant conditioning effect was medi- 
ated by some kind of verbal control (hypothe- 
ses and self-instructional sets, perhaps together 
with the transfer of prior habits). 

¹ This study was supported by grants from the 
National Science Foundation and the United States 
Public Health Service. Richard Erkes collected the 
data of Experiment I; Janet Pierce and Margo Tuite 
collected the data of Experiment II and assisted with 
statistical computations. 

There are hints of some alternative in the 
greater effectiveness of these contingent stim- 
uli in tasks that are puzzling enough to arouse 
problem solving. When subjects in our labora- 
tory course were asked to say words, and were 
permitted to express their hypotheses through- 
out, they seemed to grasp at any figural stimu- 
lus, even a light or buzzer, as a sign that would 
make sense of a silly, meaningless task. Sidow- 
ski (1954) found that uninstructed subjects 
conditioned as well as those advised that the 
light could be made to blink, and he had reason 
to believe that the “uninstructed” were self- 
instructed. With the more meaningful task of 
telling stories, Ball (1952) found a light to be 
ineffective as a reinforcer. Neither did a light 
influence response to a self-acceptance inven- 
tory (Nuthmann, 1957) or selection of pro- 
nouns while the subject formed complete sen- 
tences (Taffel, 1955). And Hildum and Brown 
(1956) found that Umhmm did not modify 
response to an attitude questionnaire, again a 
task the subject usually accepts at face value. 

In the first experiment subjects are asked to 
say words and plural nouns are followed by 
Umhmm. A number of questions are asked and 
the subjects are divided into groups depending 
on what they say that they were supposed to 
do in the experiment. We should like to know 
whether these reports of behavioral hypotheses 
are related to selection of plural nouns as we 
would in theory expect such hypotheses to be. 
Theory in this case is not formal and well- 
developed, but merely a set of theoretical 
propositions of long standing: that subjects 
under selective reinforcement tend to form be- 
havioral hypotheses, that these hypotheses 
tend to be accompanied by corresponding self- 
instructional sets (or “intentions”), and that 
these in turn lead to selection of the correspond- 
ing response class.² Thus, with the additional 
hypothesis that subjects tend to report their 
hypotheses when questioned, subjects who re- 
port that they are supposed to say plural nouns 
should say them. But other behavioral hy- 
potheses as well must be examined for the pos- 
sibility that they are “correlated hypotheses” 
(Adams, 1957). Selection of another response 
class might, because of the subjects’ prior or- 
ganization of habits, also entail selection of 
plural nouns. If any report of a behavioral 
hypothesis is found to be related to selection of 
plural nouns, we shall, in a second experiment, 
instruct subjects to use the reported response 
class. Instruction, too, is commonly hypothe- 
sized to produce sets, and sets, in turn, to result 
in response selection. The term, “verbal con- 
trol,” summarizes the set of theoretical prop- 
ositions involving behavioral hypotheses and 
instructional sets (self or social). It is used 
here because it suggests the reporting and ma- 
nipulating operations used and directs atten- 
tion to the subjects’ verbal processes. The hy- 
potheses and sets under consideration might, 
however, be verbal, partially verbal, or en- 
tirely cognitive-neural, though in this situa- 
tion, one would think, verbalizable.³ 


EXPERIMENT I 
Method 


Subjects. The subjects were 60 male undergraduates 
in the introductory psychology course, 43 experi- 
mentals and 17 controls. 

Setting and procedure. The experimental room, simi- 
lar to that described by the other experimenters, was 
8’ X 0’ X 10’, illuminated by a single hanging bulb, 
and bare except for two chairs and a table with a 
microphone and tape recorder. The experimenter was 
a male senior in psychology who was mature in manner 
and appeared to be in his early twenties. He sat behind 
the subject and recorded his responses unobserved. 

² The hypothesis of association of behavioral hy- 
potheses and self-instructional sets probably requires 
the assumption of a fair amount of task motivation 
on the part of the subjects. 

³ William James’ classical point that language may 
be inadequate for conveying complex awareness has 
been applied by Eriksen (1958) to the use of retro- 
spective criteria in studies of this type. The point is 
most troublesome for claims of learning without aware- 
ness. In this case, however, “I am supposed to say 
plural nouns” seems simple to articulate, and failure 
to do so seems strongly to imply lack of awareness. 
When reports of awareness are found to be related to 
behavior in the way awareness should be, the point 
loses some force. 

After a few minutes of conversation to establish 
rapport, the experimenter gave the following instruc- 
tions taken from Greenspoon (1955): “What I want 
you to do is to say all the words you can think of. Do not 
use any sentences or phrases. Do not count. Please 
continue until I say stop. Go ahead.” A few, either 
baffled or perverse, had to be reminded several times 
not to give phrases. 

The four published experiments of this type have 
varied in length and manner of scoring. Greenspoon 
(1955) solicited 50 minutes of responding and plural 
nouns were tallied by 5-minute periods. In the other 
three experiments subjects spoke some total number of 
words, and correct responses were scored by blocks of 
all responses. The present experiment required five 
blocks of 55 words of the subject, a chore that took the 
typical subject about 50 minutes. Scoring by blocks of 
all words has the advantage of reducing the large 
variance between subjects in output of plural nouns 
by eliminating the effect of wide individual differences 
in overall rate of uttering any words. 

For the experimental group the experimenter 
murmured “Umhmm” after each plural noun in the 
last four blocks of 55 responses. The intonation of 
Umhmm varied too much for a meaningful phonetic 
description, but it could be characterized as non- 
committal to warm. While the controls were saying 
words, the experimenter remained silent. After each 
block of 55 responses, all subjects were asked, “What do 
you think the experiment is all about?” At the end of 
the session the experimenter asked that question again 
and several others: “Did you notice whether I said 
anything?” (“If so, what?” and “What do you think 
the significance of that was?”) “Was there anything 
that you were supposed to say in order to be correct?” 
The entire experimental session was recorded on tape. 

Two people independently examined the reports— 
and, of course, independently of any knowledge of the 
subject’s performance. Their specific object was to 
sort the subjects into groups on the basis of their re- 
ported behavioral hypotheses, with a remaining cate- 
gory for those who reported nothing relevant. Sorting 
by this single principle proved easier than expected 
and the two sorts perfectly agreed. 


Results and Discussion 


Manipulations and performance. First, the 
performance of the entire experimental group 
was compared with that of the controls (see 
Figure 1). There was no need to eliminate any 
subjects from this analysis because none called 
plural nouns “correct,” stated the purpose of 
the experiment, or said that Umhmm was con- 
tingent on plural nouns, the usual criteria of 
full “awareness” and basis for exclusion. No 
two of the other studies report the same 
analysis or the same learning curve, so it is 
possible to replicate their findings here only in 
the general sense of showing that output of 
plural nouns came under “control” of the con- 
tingent stimulus. An experimental effect is 
shown most simply by observing that for 35 of 
43 experimental subjects, mean frequency of 
plural nouns, averaged over the four rein- 
forcement blocks, was greater than frequency 
of plural nouns at the first, nonreinforcement 
block; and that only 8 of 17 controls show the 
same increase. A chi square for unrelated 
groups of 5.48 with a p < .02 shows that this 
increase is significantly associated with experi- 
mental treatments. 

Fig. 1. Frequency of plural nouns in blocks of 55 
responses. 
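
The chi square for the shift counts can be checked by hand. The sketch below is a minimal illustration, not the author's stated procedure: it assumes the 2 x 2 table is experimental vs. control subjects by increase vs. no increase (35 and 8, 8 and 9), and it assumes Yates' continuity correction was applied. The text does not say so at this point, but the corrected statistic reproduces the reported 5.48, whereas the uncorrected value is about 7.07. The function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi square with Yates' continuity correction for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shrink the absolute cross-product difference by n/2 (never below zero).
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / (row1 * row2 * col1 * col2)


# Rows: experimental, control. Columns: increased, did not increase.
# 35 of 43 experimental subjects and 8 of 17 controls showed the increase.
chi2 = yates_chi_square(35, 8, 8, 9)
print(round(chi2, 2))  # 5.48
```

With 1 df, a value of 5.48 lies between the .02 and .01 critical values (5.41 and 6.63), which agrees with the reported p < .02.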

That the effect does not register as consis- 
tent differences in trend over all five blocks 
can be seen from the insignificant group differ- 
ences in linear and quadratic components of 
trend (Lines B and C of Table 1) in Grant’s 
(1956) extension of the Alexander (1946) 
analysis.⁴ Apparently the effect of the rein- 
forcement is to support output of plurals at 
some phase of reinforcement, but individual 
fluctuations over the four reinforcement blocks 
are too irregular and too heterogeneous for 
significant group differences in trend over all 
five blocks, either linear or curvilinear. This 
great heterogeneity must be in response to the 
reinforcement because the term for linear slopes 
between individuals is highly significant for the 
experimental group (F = 2.96, 41 and 123 df, 
p < .0005) and insignificant for the controls 
(F = 1.35, 15 and 45 df, p > .10) when each 
is tested against its own term for individual 
deviations from estimation. With still more 
heterogeneity in kind of individual curvilinear 
slope, which is visible in a graph of individual 
slopes, both the insignificant between-group 
and between-individual curvilinear trend terms 
are consistent with the significant chi square 
for differences in group shift. The lack of an 
overall mean difference, together with that 
chi square, also points to heterogeneity. The 
overall trend, a rise and fall in output of 
plurals, is best described by the second order 
orthogonal component (Line E). 

⁴ With heterogeneity of variance, the “true” p 
values could be a little greater than those reported, but 
all Fs to which interpretative significance is attached 
have p values of less than .01. 
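
To make the linear and quadratic trend components concrete, the following sketch scores one subject's five block frequencies with the standard orthogonal polynomial coefficients for five equally spaced points. It is only an illustration under stated assumptions: the block scores are hypothetical, the function name is invented here, and the sketch does not reproduce the full Grant (1956) analysis of variance.

```python
# Standard orthogonal polynomial coefficients for five equally spaced blocks.
LINEAR = (-2, -1, 0, 1, 2)
QUADRATIC = (2, -1, -2, -1, 2)


def trend_component(block_scores, coefficients):
    """Weighted sum of block scores; its sign and size describe the trend."""
    return sum(c * y for c, y in zip(coefficients, block_scores))


# Hypothetical plural-noun frequencies for one subject over five blocks:
# a rise followed by a fall.
hypothetical_subject = [10, 14, 17, 15, 11]

linear = trend_component(hypothetical_subject, LINEAR)        # 3
quadratic = trend_component(hypothetical_subject, QUADRATIC)  # -21
print(linear, quadratic)
```

A small linear score with a large negative quadratic score is the pattern a rise and fall in output would produce, which is why a second order component can describe a trend that the linear term misses.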

Subjects’ reports. The subjects’ answers to 
questioning fell into several conspicuous cate- 
gories. To Question 1, “What do you think 
this experiment is all about?,” there was a 
scattering of answers such as “To study how I 
think,” “A vocabulary test,” and “I haven’t 
the slightest idea”; but only one had a notice- 
able frequency. Thirty-four experimental sub- 
jects—and 11 controls—replied that the ex- 
perimenter must be studying their associations, 
that their task was to run through their asso- 
ciations. In each case the same answer had 
been given after one or more of the earlier 
blocks. This is a likely guess, since, as Bousfield 
(1953) has shown, a subject obliged to produce 
very many disconnected words usually runs 
through associated pools; and this is all of 
much consequence the subject sees himself 
doing. 

All subjects replied to Question 2 that the 
experimenter said “Umhmm” and were there- 
fore asked, “What do you think the significance 
of that was?” They suggested variously that 
Umhmm was an “encouragement to con- 
tinue,” “A distraction,” or “Of no signifi- 
cance,” and again only one reply was at all 
common. Eleven of the above 34 reported that 
they were supposed to associate in a series or 
in the same category whenever the experi- 
menter said “Umhmm.” These 11 subjects 
were therefore separable from the others. 
Typical replies were “I noticed that you ap- 
pear to say ‘umhmm’ when I seem to follow a 
line of more or less synonyms.... Your 
‘umhmms’ show me an indication that I’m on 
the right track. ... You seemed to be in favor 
of the more direct associations”; “I think the 
only thing is you pick at random, like I picked 
‘vegetables,’ and you wanted me to talk about 
that and every time I mentioned some vegetable 
you’d say ‘umhmm’”; “I think you’re [...] 
the same category ... you arbitrarily pick sub- 
jects and try to make me say what you wanted 
me to say”; “Well, I think that you’re saying 
‘umhmm’ when I say a series of words that are 
connected together . . . well, like I said before, 
the only thing I could think of was I’d start 
relating words to a definite subject and you’d 
encourage me to do that.” Most of these sub- 
jects also stated the implied converse: that no 
Umhmm signaled that they should change 
categories. None of the controls reported this 
hypothesis. 

Question 3, “Was there anything that you 
were supposed to say in order to be correct?,” 
was comparatively uninformative, mainly be- 
cause it had already been answered. Eight of 
the above 11 restated the hypothesis of “rein- 
forcement for association” and 3 answered, 
quite consistently, that no particular words or 
associations were correct. The others replied 
“No” or that if there were a correct response, 
they did not know what it was. 

TABLE 1 
SUMMARY OF TREND ANALYSES 

                                           Experimental         RFA                  Associative hypothesis   No associative hypothesis
                                           vs. controls         vs. controls         alone vs. controls       vs. controls
Source                             Error   df   MS      F       df  MS      F        df   MS      F           df  MS      F
                                   term
A. Between-group means             F        1   24.19   0.12     1  824.85  4.78*     1   11.04   0.07         1  538.69  8.01**
B. Between-group linear slopes     G        1   74.83   1.82     1  582.99  11.46***  1    1.35   0.05         1   11.05  0.47
C. Between-group quadratics        H        1   33.85   1.54     1   16.60  0.53      1   11.94   0.51         1   60.73  2.81
F. Between-individual means        I       58  203.72  10.61**** 26 172.61  6.80****  38  159.32  8.58****     24   67.22  4.50***
G. Between-individual linear       I       58   48.03   2.50**** 26  50.87  2.00*     38   28.77  1.55         24   23.47  1.57
   slopes
H. Between-individual quadratics   I       58   22.02   1.15     26  31.43  1.24      38   23.25  1.25         24   21.64  1.45
I. Individual deviations                  174   19.20            78  25.39           114   18.56               72   14.93

Reports and performance. The experimental 
subjects, then, divide most naturally into 
three groups: “reinforcement for association,” 
“associative hypothesis alone,” and “no asso- 
ciative hypothesis.” They are graphed along 
with controls in Figure 2. The great rise of the 
“reinforcement for association” group is re- 
flected in the between-group linear slopes 
(Line B of Table 1). The 
“associative hypothesis alone” group is hardly 
distinguishable from the control group and 
neither of the between group trend terms is sig- 
nificant. There are no significant trend differ- 
ences between the controls and the “no asso- 
ciative hypothesis” subjects, but this group is 
significantly below the controls in mean output 
of plurals (Line A). This is a little puzzling, 
but a group low in plural output could be 
selected by their failure to form the associative 
hypothesis if the Umhmm, which did follow 
plurals often associatively rendered, had any 
role in suggesting that the experimenter was 
interested in associations. 

Fig. 2. Frequency of plural nouns in blocks of 55 
responses. 

This analysis is obviously post hoc in some 
sense, but it does not thereby capitalize on 
chance differences. The likelihood of accepting 
a chance difference is no greater when separat- 
ing subjects on some principle before inspect- 
ing their performance than when separating 
subjects on some theoretical hypothesis before 
conducting the experiment. It is only necessary 
to identify the groups in some clear way inde- 
pendent of their performance. There does re- 
main some capitalization on chance, but it is by 
virtue of multiple comparisons. Only three 
groups are compared with controls, though, 
[...] slope is very greatly so.⁵ The theoretical inter- 
pretation of the difference is another matter. 
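
The remark about capitalizing on chance through multiple comparisons can be put in numbers. The sketch below is an illustration only: it assumes a per-comparison criterion of .01 (the level the footnote on heterogeneity of variance mentions for the Fs given interpretative weight) and three comparisons with controls. The first function further assumes the comparisons are independent, which is not strictly true here because all three share one control group; the Bonferroni bound holds without that assumption. Both function names are hypothetical.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Chance of at least one false rejection, assuming independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons


def bonferroni_bound(alpha, n_comparisons):
    """Upper bound on the same chance that holds even for dependent comparisons."""
    return min(1.0, alpha * n_comparisons)


alpha = 0.01      # assumed per-comparison criterion
comparisons = 3   # three reported-hypothesis groups, each compared with controls

fwer = familywise_error_rate(alpha, comparisons)
bound = bonferroni_bound(alpha, comparisons)
print(round(fwer, 4), round(bound, 4))  # 0.0297 0.03
```

Under these assumptions the inflation from three comparisons is modest, which is the sense in which the capitalization on chance is limited.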

⁵ Two systematic replications are now available, 
though they vary in ways that could have affected the 
number of subjects reporting the RFA hypothesis. In 
one experiment, 8 of 63 report the hypothesis; in the 
other, 12 of 35. In both, the RFA subjects perform 
much as do these and differ dramatically from controls. 
In neither experiment do other subjects differ signifi- 
cantly from controls. 

A post hoc hypothesis. In sum, subjects saying 
nothing that classifies them as “aware” in the 
usual senses show an acquisition effect in 
aggregate; those who say that they are to asso- 
ciate in a series when the experimenter says 
“Umhmm” show a considerable acquisition 
effect; and the others show none at all. This 
result is unexpected because unlike the “cor- 
rect” hypothesis, the “reinforcement for asso- 
ciation” (RFA) hypothesis has no immediately 
obvious relation to saying plural nouns. But 
let us imagine an experiment in which subjects 
say words while we reinforce in first one seman- 
tic category, then another, tallying as correct 
all responses falling in the semantic category 
that is for the moment reinforced. For the RFA 
subjects this was the perceived pattern of rein- 
forcement, and the experimenter defined a 
semantic category as correct just as long as 
plural nouns were forthcoming. It is the tally 
that differed. If a subject of the imaginary 
experiment hypothesized that he was to asso- 
ciate in the category of the moment when the 
reinforcement was “on” and to scan for a new 
category when it was “off,” he should be very 
well able to increase his tally of correct re- 
sponses. Association, we know from common 
sense and an inspection of the Minnesota 
norms (Russell & Jenkins, 1954), tends to pro- 
duce semantically related words; and with in- 
struction to associate in the “same category,” 
the tendency should clearly be great. Now it 
might be that with the same instruction, 
stimulus and response items would tend also to 
be grammatically related—to be of the same 
part of speech or linguistic form class. Words 
that can take the same position in an utterance 
conceivably could be linked in some way that 
would be manifest associatively. The sug- 
gestion is that with instruction to associate in 
the same semantic category, semantic and 
grammatical associative responses might be 
correlated—that the RFA hypothesis is a cor- 
related hypothesis. 

Subjects of the RFA group report that they 
are to associate in a series upon presentation of 
the “Umhmm” and to scan for a new category 
upon its omission. And a reported hypothesis 
should tend to be coupled with some self- 
instructional set. Since the reinforcement in 
this experiment followed plural nouns we would 
then infer an associative set after plurals and a 
nonassociative set after other words. There 
was no report of this kind from controls and no 
reason to expect the same alignment of sets for 
them. In short, it may be that frequency of 
plural response to prior plural is related to an 
associative as opposed to a nonassociative set. 
“Diamonds,” when one continues a series, 
should bring “rubies” or “pearls”—plurals. 
Scan for a new category and “telephone” might 
come as easily as “cigarettes.” This post hoc 
hypothesis, to be tested in Experiment II, 
would make sense of the great increase in 
plural nouns for the RFA group, relative to 
controls, and the operant conditioning of 
plural nouns could be attributed to hypotheses 
which when acted upon are the occasion for 
manifesting prior verbal habits. 

Interpretation of the subjects’ reports. We 
should first consider a possible objection that 
the questions asked might have suggested the 
reports obtained. The results lose generality if 
the RFA hypothesis would not have been 
formed without the early and repeated ques- 
tion, “What do you think the experiment is all 
about now?” It seems unlikely, though, that an 
adult brought to a small room and asked to say 
words, no phrases or sentences please, would 
not be provoked to wonder as much. And if 
this general question could make one hypothe- 
sis more likely, it probably would have done 
the same for the plural noun hypothesis; yet 
no one called plural nouns correct. Moreover, 
Sidowski (1954) has reported, and others have 
implied, that their subjects adopted question- 
ing, problem solving sets without special in- 
structions. The thought that the RFA hypothe- 
sis could have been suggested entirely by 
questioning at the end is embarrassed by the 
coincidence that only the RFA group shows an 
increase in plurals. Whether or not subjects in 
other studies, and how many, formed the RFA 
hypothesis is, of course, unknown; but, in any 
case, subjects who failed to report that hy- 
pothesis in the present experiment do not repli- 
cate the verbal conditioning effect. 

The present research does not accord un- 
questioned validity to the subjects’ reports or 
in any way rely upon a phenomenological data 
language. The concern is to determine whether 
the subject’s report of a behavioral hypothesis 
is related to response selection in ways that we 
would in theory expect his hypothesis as con- 
struct to be related to response selection. We 
have considered that a behavioral hypothesis is 
related to response selection through a self- 
instructional set. The hypothesis of valid re- 
port is but one more proposition within the 
theoretical network to be evaluated by the 
data. And these propositions, together with the 
post hoc hypothesis, predict the obtained rela- 
tion of report to response selection. Moreover, 
on the common hypothesis that instruction to 
perform a response class may arouse a set to do 
so, it is also important to learn whether the 
relation of instruction to response selection is 
like that found for the corresponding report 
and response selection. The hypothesis of valid 
report will be further supported if the sets to be 
induced by the experimenter’s instructions in 
Experiment II are found to be related to re- 
sponse selection in the same way found for the 
reports from which self-instructional sets were 
inferred in Experiment I. In short, the subjects’ 
reports are interpretable if they behave as ex- 
pected from the network of hypothesized rela- 
tions among instructions, hypotheses, sets, re- 
ports, and response selection. The logic of 
science embodied in discussions of a nomologi- 
cal network and construct validity (Carnap, 
1956; Cronbach & Meehl, 1955; Hempel, 1952; 
Sellars, 1948; Spence, 1958) may be appropri- 
ate, not only for the subject’s response to a 
personality questionnaire and extended per- 
sonal history, but also for his response to 
questions and more immediate experimental 
history. 


EXPERIMENT II 


This experiment tests the post hoc hypothesis 
of Experiment I, that frequency of responding 
with plural nouns to plural nouns is related to 
an associative as opposed to a nonassociative 
set. It also tests the corresponding hypothesis 
for Other Words, that frequency of responding 
with Other Words to Other Words is related in 
the same way to those sets, a possibility that 
has implications for Greenspoon’s (1955) re- 
ported conditioning of Other Words. In Experi- 
ment I, subjects hypothesized what was ex- 
pected of them; those hypotheses are the basis 
for instructing subjects in what is expected of 
them in Experiment II. By inducing sets in 
this way, the suspected verbal habits might be 
manifest in the absence of verbal reinforce- 
ment. 


Method 


One hundred plural nouns and 100 Other Words 
were drawn at random from the protocols of the first 
experiment and were used as stimulus words in a word 
association test. Within each block of 50 stimulus words 
presented, 25 plural nouns and 25 Other Words were 
randomly intermixed. The subject responded to each 
stimulus word in the first and third blocks of 50 words 
with an associative set and to words of the second and 
fourth blocks with a nonassociative set. Just before 
the first and third blocks of words were presented, the 
experimenter read the following instructions to induce 
an associative set: “This is a study of word habits. 
Now listen to each word I read and then tell me the 
first word you think of that is in the same category, 
that is related—that would continue in a series.” To 
induce the nonassociative set prior to the second and 
fourth blocks the experimenter said, “Now listen to 
each word I read and tell me the first word you think 
of that is not in the same category, that is unrelated— 
that would not continue in a series.” The experimenter, 
a female graduate student, read each word aloud and 
recorded the subject’s responses. Subjects were run in 
the same experimental setting as that of Experiment 
I. At the end of the session the experimenter asked, 
“Was there anything you were supposed to say in order 
to be correct?,” and the subject’s responses were tape 
recorded. 
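
The stimulus schedule described above can be sketched in a few lines. This is a hedged illustration of the design as stated (100 plural nouns and 100 Other Words, four blocks of 50 with 25 of each class randomly intermixed, associative instructions before the first and third blocks and nonassociative before the second and fourth). The word lists, the function name, and the seed are placeholders, and the original randomization procedure is not known.

```python
import random


def build_blocks(plural_words, other_words, n_blocks=4, per_class=25, seed=None):
    """Split both word pools into blocks and intermix the two classes within each block."""
    rng = random.Random(seed)
    plurals, others = list(plural_words), list(other_words)
    rng.shuffle(plurals)
    rng.shuffle(others)
    blocks = []
    for i in range(n_blocks):
        lo, hi = i * per_class, (i + 1) * per_class
        items = [(w, "plural") for w in plurals[lo:hi]]
        items += [(w, "other") for w in others[lo:hi]]
        rng.shuffle(items)  # random intermixing within the block
        # Blocks 1 and 3 are associative, blocks 2 and 4 nonassociative.
        instruction = "associative" if i % 2 == 0 else "nonassociative"
        blocks.append({"set": instruction, "items": items})
    return blocks


# Placeholder stimuli standing in for words drawn from the Experiment I protocols.
plural_words = [f"plural_{i}" for i in range(100)]
other_words = [f"other_{i}" for i in range(100)]
blocks = build_blocks(plural_words, other_words, seed=0)

for block in blocks:
    print(block["set"], len(block["items"]))
```

Each subject's responses to the 50 plural stimuli under each set then fill one 2 x 2 table (plural vs. other response by associative vs. nonassociative set), and likewise for the Other stimuli.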

Subjects of the primary sample were seven male 
undergraduates, presumably representative of the 
same population as those of Experiment I. Two others 
were dropped from the experiment when they failed to 
follow instructions, in one case by persistently giving 
simple associations that were not “in a series” (for 
example, blue-bird rather than blue-green or yellow), in 
the other case by failure to shift to a nonassociative 
set. In both cases the session was halted and question- 
ing revealed that the subject had not understood the 
instructions. It is often difficult to judge for a particular 
response whether or not a subject has followed instruc- 
tions; it is not so difficult after a series of responses to 
judge that he consistently does not. An additional five 
high school students and five graduate students in 
psychology were also run as a partial check on the gen- 
erality of the habit. None knew the hypothesis under 
investigation or the identity of the investigator for 
whom the assistant collected the data. When ques- 
tioned none mentioned grammatical categories or re- 
plied that a particular kind of response was correct. 
The graduate students accepted the procedures as an 
assessment of verbal habits, of a character unknown to 
them, and expressed some surprise upon being told 
later that the experimenter was interested in gram- 
matical rather than semantic categories. Most of the 
others submitted themselves to a “personality test.” 


Results and Discussion 


Table 2 presents mean frequency of plural 
and other responses, for both samples, in the 
several experimental conditions. These values 
summarize the effects obtained; tests of signifi- 
cance are based on the analysis of individual 
contingency tables. Phi coefficients and chi 
squares for the relation of frequency of plural 
response to plural stimulus (plural vs. other 
response) with associative vs. nonassociative 
sets are given in Table 3. Yates’ correction was 
applied to the phi formula when expected fre- 
quencies were less than five. All phi coefficients 
from the undergraduate subjects are positive 
and all converted chi squares are highly signifi- 
cant. Data from the high school and graduate 
students (Table 3) are consistent with this. 
Contrary to common misunderstanding of the 
“independence” requirement, phi and chi 
square are appropriately computed upon a 
single individual’s responses. When significant 
they imply replicability for a particular sub- 
ject. Generality across subjects can be inferred 
from 16 individual replications, which, ignoring 
size of effects, has a chance probability of 2⁻¹⁶. 
The results clearly support the post hoc hy- 
pothesis of a relation of plural response to 
plural stimulus with a set to associate in a 
series as opposed to a nonassociative set. They 
suggest a basis for the increase in plural nouns 
of the RFA group in Experiment I. 

TABLE 2 

MEAN FREQUENCIES OF PLURAL RESPONSE AND OTHER 
RESPONSE TO PLURAL STIMULI AND OTHER STIMULI 
UNDER ASSOCIATIVE AND NONASSOCIATIVE SETS 

                                Plural stimuli            Other stimuli
                             Asso-      Nonasso-       Asso-      Nonasso-
                             ciative    ciative        ciative    ciative
Subjects                     set        set            set        set
College undergraduates
    Plural R                 32.3       11.6            1.7        4.7
    Other R                  17.7       38.4           48.3       45.3
High school and gradu-
ate students
    Plural R                 35.8       13.2            3.6        6.4
    Other R                  14.2       36.8           46.4       43.6
All subjects
    Plural R                 34.4       12.5            2.9        5.7
    Other R                  15.7       37.5           47.3       44.3

TABLE 3 

ASSOCIATION OF PLURAL RESPONSE AND OTHER 
RESPONSE WITH ASSOCIATIVE AND NONASSOCIATIVE 
SETS FOR PLURAL STIMULI AND OTHER 
STIMULI 

                                Plural stimuli           Other stimuli
Subjects                     φ      χ²      p <       φ      χ²     p <
College undergraduates
    1                       .27     7.3     .01       .05    0.3
    2                       .40    16.2*    .001      .09    0.9
    3                       .40    15.8*    .001      .19    3.4    .10
    4                       .64    41.2     .001      .13    1.6
    5                       .65    42.7     .001      .08    0.7
    6                       .28     8.1     .01       .16    2.6
    7                       .33    10.9     .001      .07    0.5
High school and graduate
students
    1 HS                    .17     3.0     .10       .05    0.3
    2 HS                    .22     4.8     .05       .11    1.3
    3 HS                    .42    17.4     .001      .11    1.3
    4 HS                    .31     9.7     .005      .19    3.5    .10
    5 HS                    .56    31.6     .001     −.08    0.7
    6 G                     .70    49.6     .001      .05    0.3
    7 G                     .52    26.5     .001      .15    2.3
    8 G                     .38    14.4     .001      .00
    9 G                     .82    67.1     .001      .09    0.9
    10 G                    .57    32.6     .001      .23    5.4    .02

* Phi’s to 4 decimal places, which differed before rounding, 
were used in the computation of chi square. 
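
The conversion between phi and chi square used for each subject's contingency table can be sketched as follows. This is a minimal illustration, assuming a 2 x 2 table of plural vs. other response under associative vs. nonassociative sets with N = 100 plural stimuli per subject, and using the identity chi square = N times phi squared without Yates' correction. The cell counts below are hypothetical, not any subject's data, and the function names are invented for the example.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi for the 2 x 2 table [[a, b], [c, d]]."""
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom


def chi_square_from_phi(phi, n):
    """Convert phi to chi square (1 df) for a table of n observations."""
    return n * phi ** 2


# Hypothetical subject. Rows: associative set, nonassociative set.
# Columns: plural response, other response (50 plural stimuli per set).
a, b, c, d = 32, 18, 12, 38
phi = phi_coefficient(a, b, c, d)
chi2 = chi_square_from_phi(phi, a + b + c + d)
print(round(phi, 2), round(chi2, 2))  # 0.4 16.23
```

The entries in Table 3 follow the same pattern: a phi of .40 for plural stimuli goes with a chi square near 16, which is what N = 100 implies.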

Table 3 also presents analogous information 
for the association of frequency of Other re- 
sponse to Other stimulus with the two induced 
sets. The tests reported come from contingency 
tables representing the association of other re- 
sponse vs. plural response with associative vs. 
nonassociative sets. These phi’s tend toward 
zero or low positive relation. Taking both 
samples, only two of the chi squares for this 
relation have p values lower than .10, and only 
one lower than .02. But 15 of the 17 phi’s are 
positive, and by the sign test this outcome has 
a chance probability of less than .01 (two- 
sided). The data at least support the possi- 
bility that hypotheses and associative habits 
could have mediated the operant conditioning 
of Other Words reported by Greenspoon 
(1955). None of these effects is large, but then 
Greenspoon (1955) reports a comparatively 
weaker conditioning effect for Other Words— 
an F at the .05 level for the three conditions 
Umhmm, Control, and Huh-uh. 
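
The sign test probability quoted above is easy to verify. The sketch below computes the exact two-sided binomial probability of 15 or more of 17 phi's falling on the same side when either sign is equally likely. It counts all 17 subjects as the text does; how the one phi of .00 in Table 3 was handled is not stated, so the variant that drops it is shown only as an assumption. The function name is hypothetical.

```python
from math import comb


def sign_test_two_sided(n_positive, n_total):
    """Exact two-sided sign test: probability of a split at least this lopsided."""
    k = max(n_positive, n_total - n_positive)
    tail = sum(comb(n_total, i) for i in range(k, n_total + 1)) / 2 ** n_total
    return min(1.0, 2 * tail)


p_all = sign_test_two_sided(15, 17)       # all 17 subjects counted
p_no_ties = sign_test_two_sided(15, 16)   # assumption: the zero phi is dropped
print(p_all < 0.01, p_no_ties < 0.01)     # True True
```

Either way the probability falls well below .01, so the conclusion does not depend on the treatment of the tie.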

The latter contingency tables can, of course, 
just as well be interpreted as showing a positive 
relation between plural response to Other 
Words and a nonassociative set. Viewed in this 
way, they suggest that scanning for a new 
category after Other Words, reported by 
subjects of Experiment I, might also have in- 
creased the tendency toward production of 
plural nouns. 

Since categorization may follow many pos- 
sible attributes, it would be difficult for an ob- 
server to judge with great accuracy whether or 
not the subject’s responses were in every 
instance “‘in the same category” as the subject 
would categorize. However, it was only neces- 
sary to observe that the associative and non- 
associative instructions were generally fol- 
lowed in order to interpret the data of Table 3 
as evidence for the correlation of two response 
classes, responses matched in grammatical 
category to stimulus and responses matched in 
semantic category to stimulus. 

The verbal association experiment permits 
greater control over the manipulation of sets, 
as well as continuity with a large literature. It 
does not permit a clear comparison of the 
magnitude of effects with that of the free re- 
sponse procedure. Nevertheless, the procedure 
demonstrates the kind of effect hypothesized 
and shows it to be appreciable under fairly 
conservative conditions. The number of stimuli 
is fixed in Experiment II, and it does not pro- 
vide for cumulative effects of the type possible 


in Experiment I. Moreover, in the free re- 
sponse procedure, the associative response is 
often given after a series of two or more plurals 
and the availability of another plural ought to 
be even greater in that case than after a single 
plural. The hypothesis only requires that we 
find the same kind of relation of instructional 
sets to behavior that we have found for 
reports and behavior, reports from which self- 
instructional sets were inferred. The quantita- 
tive effects of instructions uniformly adminis- 
tered and self-instructions irregularly emerging 
are not fairly compared in any obvious way. 

It is worth noting that the grammatical as- 
sociative habits transferred without reports by 
the subjects that they were supposed to pro- 
duce plurals or Other Words on plural and 
Other stimulus. This finding agrees with the 
common view that transfer may occur without 
awareness even if acquisition may not. And 
finally, the experiment again calls to mind (cf. 
Watt, 1905) the importance of instructional 
sets in the selection of associative habits to be 
manifest. 

GENERAL DISCUSSION 

Experiment II supports the interpretation 
placed on Experiment I. If RFA subjects acted 
on the hypotheses they reported, they should 
have increased their output of plural nouns. An 
associative set after plural nouns should have 
increased the rate of plural nouns, and so 
should have a nonassociative set after Other 
Words, though much less so. Taken together, 
the experiments suggest a mechanism of verbal 
control that may have generality. Subjects 
may hypothesize what is expected of them and 
instruct themselves to act accordingly. Acting 
accordingly may constitute a response or oper- 
ation as relevant cues are encountered, and the 
execution of correct responses as the direct 
manifestation of prior habits. In Experiment I, 
hypotheses are reported, and we infer cor- 
responding self-instructional sets to associate in 
series on cue of the reinforcement and to scan 
for a new category on cue of its omission. With 
this alignment of sets, to judge from Experi- 
ment II, grammatical associative habits are 
manifest in plural response to prior words 
uttered. 

Relation to other findings. If impressive evi- 
dence of verbal operant conditioning without 
awareness, beyond the effect of Umhmm on 
plural nouns and Other Words, remains un- 
touched by the implications of this study, then 
the present findings of course have less signifi- 
cance. Sidowski (1954) reports that a light was 
effective reinforcement for the conditioning of 
plural nouns and Greenspoon (1951) reports 
the same for both a light and a buzzer. Umhmm 
rather than a light or buzzer was used in the 
present study not only because it has been used 
more often, but also for the greater generality 
of any finding that the contingent stimulus 
acts as a cue, a source of hypotheses and an 
occasion for acting on them, rather than as a 
gratifying social acknowledgment that rein- 
forces without awareness. A light or a buzzer 
ought to be a less gratifying acknowledgment 
and a more figural cue. 

Greenspoon (1955) also found that Huh-uh 
decreased the frequency of plural nouns. 
Though plausible, it would strain a point to 
suppose that Huh-uh could have signaled 
“change categories,” a hypothesis which Ex- 
periment II suggests would depress the fre- 
quency of plurals. In any case, the significant 
effect reported comes from an F for three con- 
ditions—Umhmm, Control, and Huh-uh—and 
the significance of the variance might be due to 
the effect of Umhmm: the Umhmm group 
differs by t test from controls at all four rein- 
forcement blocks, the Huh-uh group at only 
one block. Wilson and Verplanck (1956) add 
that Umhmm or Good increased the rate of 
saying adverbs in six of seven cases. Though it 
should be documented, analogous adverb-to- 
adverb association habits are a strong possi- 
bility, and given the same task used in the 
present study, their subjects seem no less likely 
to form the RFA hypothesis. Furthermore, 
controls were not used and the overall decline 
in frequency of plural nouns produced by con- 
trols here suggests that adverbs might increase 
in rate without reinforcement. A striking cumu- 
lative curve for frequency of pecking with 
application of a contingent stimulus is com- 
pelling evidence for the functional control of 
that stimulus. A significant shift in output of 
many human response classes may not be. 

Where areas of semantic content such as 
“travel words” or “living thing words” are re- 
inforced (Wilson & Verplanck, 1956) the RFA 
hypothesis should be no less likely, and a set to 
associate in the same category should obviously 
produce more words in that category. Wilson 
and Verplanck do acknowledge that many of 
their subjects stated “correlated hypotheses” 
(see Adams, 1957), for example, “geographic 
locations” for “travel words,” but they report 
that these subjects performed no differently 
than did subjects reporting nothing of rele- 
vance. Krasner (1958) and Adams (1957) have 
reviewed a number of other studies, employing 
procedures other than the free operant pro- 
duction of single words, in which learning with- 
out awareness is reported. Whatever the signifi- 
cance or deficiency of these studies, they are 
not discussed here beyond observing that both 
reviewers question the adequacy of criteria for 
awareness and the exclusion of correlated hy- 
potheses. 

The question of learning without awareness. 
Does Experiment I show learning without 
awareness? Even to ask the question requires 
that we specify some conception of awareness. 
Certainly there are many objects of awareness 
and the question is not whether unconscious 
subjects learned. There is no obvious solution 
in the literature, where investigators describe a 
number of awarenesses their subjects are said 
to learn without. But the matter should not be 
entirely arbitrary if we are interested in other 
than learning without this-that-and-the-other 
awareness variously revealed. 

Questions of theoretical significance require 
that we specify some awareness that is theo- 
retically relevant. First of all, we wish a theo- 
retical term for awareness that will enter into 
theoretical propositions relating it to response 
selection. A “correct behavioral hypothesis”— 
the hypothesis that a correct response class is 
correct—is such a term. It is related to response 
selection within the theoretical network de- 
scribed. Verbalization of the contingency of the 
reinforcement on the correct response class, the 
most favored conception, does not so clearly 
predict selection of correct responses because 
it may leave the subject wondering what 
Umhmm was for and what he was supposed to 
do. Nor is there a clear inference of verbal 
antecedents in any number of acceptable state- 
ments of the purpose of an experiment or a re- 
port that learning occurred. 

But the matter is additionally complicated 
by Adams’ (1957) important suggestion that 
“correlated hypotheses” might account for 
many of the effects reported. By “correlated 
hypothesis” Adams seems to mean any aware- 
ness other than of the correct response class 
that could account for better than chance per- 
formance. But this suggestion is not useful 
enough for prediction and too useful for post 
hoc explanation. The concept begs redefinition 
by some other specifiable relation to the cor- 
rect response class. Let us consider instead that 
a hypothesis is a “correlated hypothesis” if the 
response class it calls correct is correlated with
the response class the experimenter calls cor- 
rect. Any hypothesis naming some incorrect 
response class as correct may be related within 
the theoretical network to selection of the re- 
sponse class it names. If that response class and 
the correct response class are correlated, then 
such an hypothesis becomes an antecedent 
from which to predict selection of the correct 
response class. To identify a correlated hy- 
pothesis, we might instruct the occurrence, 
then nonoccurrence, of the response class it 
calls correct, and observe a correlation of 
presence and absence of those responses with 
presence and absence of the responses the 
experimenter calls correct. Or where the 
hypothesized response class is identified with 
limited reliability, as is “associating in a series,” 
we might observe a correlation of those in- 
structions with presence and absence of mem- 
bers of the correct response class. This alterna- 
tive follows on the hypotheses that instruction 
may induce the hypothesis-related set and re- 
sponse class. On the evidence of Experiment II, 
then, “I am supposed to associate in a series 
when you say ‘Umhmm’ ” was identified as a
correlated hypothesis for “I am supposed to 
say plural nouns.” 
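
The identification procedure described above comes down to a correlation between two presence/absence series. A minimal sketch, assuming each trial is coded 1 or 0 for whether the hypothesized response class was instructed and for whether a member of the correct response class occurred; the function name and the coding scheme are illustrative and do not come from the experiments reported here:

```python
from math import sqrt

def phi_coefficient(instructed, correct):
    """Phi correlation between two equal-length sequences of 0/1 flags.

    instructed: 1 where the hypothesized response class was instructed
    correct:    1 where a member of the correct response class occurred
    (Both codings are assumptions made for this illustration.)
    """
    if len(instructed) != len(correct):
        raise ValueError("sequences must have the same length")
    pairs = list(zip(instructed, correct))
    # Cells of the 2 x 2 presence/absence table.
    a = sum(1 for i, c in pairs if i and c)          # instructed, correct present
    b = sum(1 for i, c in pairs if i and not c)      # instructed, correct absent
    c_ = sum(1 for i, c in pairs if not i and c)     # not instructed, correct present
    d = sum(1 for i, c in pairs if not i and not c)  # not instructed, correct absent
    denom = sqrt((a + b) * (c_ + d) * (a + c_) * (b + d))
    if denom == 0:
        raise ValueError("a margin is empty; phi is undefined")
    return (a * d - b * c_) / denom
```

A positive coefficient would mark the instructed hypothesis as a candidate correlated hypothesis in the redefined sense; how large it must be is left open here, as it is in the text.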

Thus, either a correct or correlated hypothe- 
sis—as redefined—should in theory lead to 
selection of the correct response class, and to- 
gether they provide a conception of awareness 
relevant to a second theoretical concern. Since 
“learning without awareness” is commonly 
interpreted as supporting a theory of response 
selection by automatic action of aftereffects 
(Thorndike, 1933), we should also like a con- 
ception of awareness relevant to that theory. 
The classical theory of automatic strengthen- 
ing, unqualified by further assumptions, clearly 
implies that learning should occur in the 
absence of either a correct or a correlated hy- 
pothesis. Any such evidence would tend to 
support that theory as opposed to some alterna- 
tive of verbal control. However, with report of 
a correct or correlated hypothesis as the indicator
of awareness, Experiment I presents no
evidence for learning without awareness. 

But is there a way to recognize that learning 
occurred only with reports of awareness and 
still hold that it occurred by automatic 
strengthening? The problem is to account for 
the relation of reports to performance in a way 
consistent with that theory.

1. Could the hypotheses and report some- 
how be in response to the increased usage of 
plural nouns or to the process of automatic 
strengthening itself? Perhaps subjects rational- 
ize their plural output as “associating in a 
series after ‘Umhmm,’ ” or the hypothesis and 
report might be a phenomenal and verbal 
emergent of the process of automatic strength- 
ening. These assumptions lack plausibility and 
generality. With no principle to link selection 
of plurals with this particular hypothesis and 
report, we are left to wonder why subjects did 
not report the opposite, or what a subject 
would report if we automatically strengthened 
his adjectives or ear-tugging. 

2. There is a more serious variant of this set 
of assumptions. Suppose that whatever response
class is automatically selected, a hypothesis
or report of that response class follows
as a rationalization or as a phenomenal or 
verbal emergent. And suppose it was associat- 
ing in a series after Umhmm that was the re- 
sponse class automatically strengthened. This 
is probably more consistency than a theory of 
rationalization can manage—would all subjects 
who learn need to rationalize their behavior? 
Or it would require radical augmentation of a 
classical theory of automatic strengthening. To 
assume that automatic strengthening entails 
phenomenal or verbal emergents is, in fact, a 
new and radical theory, and should be brought 
forward strongly, if at all, so that it might be 
evaluated. 

3. Most appealing is the thought that both 
awareness—the RFA hypothesis—and auto- 
matic strengthening of “associating in a series” 
might be consequences of the reinforcement. 
Perhaps learning occurred only with awareness 
because enough reinforcement to produce 
learning is enough to produce awareness. But 
this assumption must be supported with others. 
Number of reinforcements was not system- 
atically varied, but programed alike for all, and 
those who learned show no initial advantage at 
the first block which could bring them more 
reinforcements thereafter (M_RFA = 5+;
M_Other = 5.9). Still we might assume that the
Umhmm was at the same time made more re- 
warding and more figural for some, and those 
subjects learned and reported the correlated 
hypothesis. Or some unknown kinds of indi- 
vidual receptivities to reinforcement and dis- 
positions to awareness might be correlated. 

Explanations 2 and 3 would still require an 
hypothesis of correlated response classes in 
order to account for the coincident selection of 
plural nouns. Explanation 2 does not suggest 
why some learned automatically and some not. 
Neither explanation would account for other 
findings that some subjects report a nonrein- 
forced response class as correct and in fact 
select it (Dulany, 1960). And, of course, they 
leave the effects of instruction in Experiment 
II to be explained in some other way. In summary,
a theory of automatic strengthening by
aftereffects apparently does not account for 
these findings without auxiliary assumptions or 
radical augmentation. A little thought will 
usually supply assumptions to brace a chal- 
lenged theory and it is rarely disproved. But 
the result does not seem a better theory than a 
theory that subjects report processes prior to 
and instrumental in response selection. It 
seems better to interpret the results within a 
network of simple hypotheses that appear 
sufficient to account for our findings, and that 
bring within the same formulation the com- 
mon relation of reports and of instruction to 
response selection. 

The question of operant conditioning. Neither 
reports of behavioral hypotheses nor the 
mechanism of verbal control outlined here enter 
into the language of operant conditioning as 
described by Skinner (1938) and extended by 
many (for example, Salzinger, 1959) to the 
verbal conditioning literature. Operant con- 
ditioning is said to occur when rate of response 
is brought under the functional control of a 
contingent stimulus (Skinner, 1938, 1953). 
How well do empirical operant principles de- 
scribe the data of subjects in Experiment I? 
There are, of course, many possible abstractions 
upon the data, and the fit of empirical operant 
principles is no less good for uncovering medi- 
ating mechanisms. A free operant came under 
the functional control of a contingent stimulus. 
Empirical operant principles will describe the 
performance of subjects with the RFA hy- 
pothesis and loosely fit the aggregate behavior 
of the total experimental group. But the descriptive
language of operant conditioning gives
no account of the difference in performance of 
those reporting and those not reporting the 
critical hypothesis. With attention to the sub- 
jects’ reports there is less mystery that some 
subjects “condition” and some do not. If there 
are other resources within Skinner’s (1953, 
1957) extended system to account for such 
findings, an accumulation of similar findings 
would provide a challenge to call upon them. 
No formulation is obliged to account for ir- 
relevant findings or to answer irrelevant 
questions. But verbal conditioners, too, ap- 
parently see awareness and possible verbal 
control as critical to their position, and it is 
they who have raised the question. A lack of 
awareness has been set as a condition for 
specifying the laws relating empirical rein- 
forcers and response classes (Verplanck, 1955, 
p. 597). And most investigators are at pains to 
report that verbal conditioning occurred with- 
out, or unrelated to, reports of some kind of 
awareness. The languages of operant con- 
ditioning and verbal control are alternative 
formulations in the sense that each may have 
heuristic value. Wherever there is evidence of
a relation of reported behavioral hypotheses or 
of behavioral instructions to response selection, 
they would not seem to be equally adequate 
accounts of the data. 

Still to be considered is the point that, apart 
from the question of awareness or verbal con- 
trol, operant conditioning procedures have 
been shown to “identify” plural nouns as a re- 
sponse class. Identification of response classes 
is, of course, a clearly stated and fundamental 
aim of operant analysis, and in the verbal 
operant conditioning literature this identifica- 
tion is made through the manipulation of a 
contingent stimulus. Verplanck (1955) re- 
phrases Skinner’s (1938) definition of a re- 
sponse class in this way: “...a part of be- 
havior (a) that is recurrently identifiable and 
hence enumerable; and (b) whose rate of oc- 
currence can be determined as a systematic 
function of certain classes of environmental 
variables” (p. 595). Salzinger (1959) concurs. 
By the definition given, however, plural nouns 
ought to be identified as a response class when 
subjects are instructed to utter them and do 
so—voluntarily. That they will is easily shown, 
and this method of identification has been a 
routine part of exploratory work preceding this 
and other of our experiments on the selection 
of various response classes (Dulany, 1960). It
clearly answers the question whether a col- 
lection of possible responses will function as a 
behavioral variable by showing their rate of 
occurrence to be a function of presence and 
absence of instruction. 

The empirical relation of plural nouns to a 
contingent stimulus is interesting and valuable 
from many standpoints. But though manipula- 
tion of a contingent stimulus is a sensible re- 
course with prelinguistic organisms, for the 
many human response classes accessible to 
awareness and capable of verbal control, in- 
struction should be a simpler and more fruitful 
way of identifying functional response classes. 
We should not then have to depend upon the 
emergence of awareness or, alternatively, upon 
a fragile and disputable finding of “conditioning
without awareness.” The unsettled empirical
question is whether response classes
identified so simply will behave as response 
classes under selective reinforcement in the 
absence of awareness and verbal control. The 
present experiments offer no assurance that 
plural nouns will. 


SUMMARY 


It is not certain that “verbal operant con- 
ditioning” occurs without awareness or is ade- 
quately described in the language of operant 
conditioning. 

In the first experiment, subjects said words, 
and plural nouns were reinforced with Umhmm. 
In the aggregate, subjects showed a significant 
shift to plural nouns when compared with 
controls. Upon questioning, approximately 
25% of the experimental subjects reported the 
hypothesis that whenever the experimenter 
said “Umhmm” they were to associate in series
and that no acknowledgment meant that they 
were to change semantic categories. This group 
produced a highly significant acquisition effect, 
and the remaining subjects none. With moti- 
vated subjects these hypotheses should be ac- 
companied by self-instructional sets. For the 
successful experimental subjects, then, we infer 
that a “series associative set” tended to follow
plural nouns and a “nonassociative set” tended
to follow Other Words. None of the controls 
reported the critical hypothesis. If response 
after plural nouns brings more plural nouns 
with a set to continue a series than with a set 
to find a new category, the present verbal conditioning
effect might be ascribed to the mediation
of hypotheses, sets, and the transfer of
prior verbal habits. 

Experiment II presented a word association 
test with verbal reinforcement excluded. Fre- 
quency of plural nouns in response to plural 
nouns was significantly associated with a set 
to associate in series as opposed to a set to 
change categories. Other response to Other 
Words showed the same, though weaker rela- 
tion to these sets. 

The paper discusses the implications of these 
findings for some of the reports of verbal 
operant conditioning, for the question of learn- 
ing without awareness, and for the adequacy 
of operant conditioning principles to describe 
the data obtained, making the following 
principal points: 

1. The “conditioning” of plural nouns in 
the present study was mediated by a mecha- 
nism of verbal control. Subjects may hypothe- 
size that some response class is correct and 
instruct themselves to respond accordingly. 
Response to relevant cues as encountered 
may yield correct responses as the manifesta- 
tion of prior habits. 

2. This mechanism provides a possible ac- 
count of the verbal conditioning effects ob- 
tained in several often-cited experiments. 

3. The concept of “correlated hypothesis” 
requires further explication lest it become too 
convenient a post hoc explanation. To identify 
a correlated hypothesis we may instruct the 
occurrence and nonoccurrence of the response 
class it calls correct and observe a correlation 
of these instructions (or of the instructed 
response class) with presence and absence of 
members of the response class the experi- 
menter calls correct. 

4. The question of learning or conditioning 
without awareness is made more meaningful 
by specifying an awareness more clearly 
relevant to theory. On common theoretical 
propositions, one kind of awareness—a correct 
or a correlated hypothesis—should lead to 
selection of the correct response class. And 
a theory of automatic action of aftereffects 
implies that learning should occur in the absence
of that kind of awareness.

5. With report of the correct or a correlated 
response class as the criterion of awareness, 
Experiment I presents no evidence for learning 
or conditioning without awareness. 
6. The language of operant conditioning, as
it has been extended to the verbal conditioning 
experiment, does not describe the differences 
in performance of those reporting and those 
not reporting the critical hypotheses. 

7. The usefulness of operant conditioning
procedures for “identifying” many functional
response classes at the human level should be
evaluated against the relative ease of identifying
response classes by using instructions.

8. The methodological strategy illustrated
by these experiments—the joint use of reports 
and instructions—may be useful in the analysis 
of verbal control in other learning and condi- 
tioning paradigms. 
SOCIAL COMPETENCE AND OUTCOME IN PSYCHIATRIC DISORDER


EDWARD ZIGLER AND LESLIE PHILLIPS
From Kraepelin onward, prognosis has
been a central problem in psychopathology.
The elusive resolution of this
problem is equaled in the clinical sphere only
by that of the unsettled etiology of the functional
disorders. The comparable lack of
knowledge concerning prognosis and etiology
may be consequent to an inherent relation
between the outcome of a disorder and the
nature of the disorder. Apart from such theoretical
considerations, the problem of prognosis
is also of concern to the practitioner because
of its obvious practical import. In view of the
limited resources of the typical hospital setting,
the ability to predict outcome successfully
would be of considerable aid in making administrative
decisions and optimally utilizing
treatment resources.

A number of reviewers have examined the
literature concerned with prognosis (Bellak,
1948; Blair, 1940; Chase & Silverman, 1941;
Huston & Pepernick, 1958; Langfeldt, 1937;
Mayer-Gross & Moore, 1944; Phillips, 1949;
Zubin & Windle, 1954). Certain investigators
(Pascal, Swensen, Feldman, Cole, & Bayard,
1953) have been impressed with the plethora
of studies done on prognosis, while others
(Schofield, Hathaway, Hastings, & Bell, 1954)
have been distressed by the dearth of such
inquiries. There appears to be almost universal
agreement, however (Huston & Pepernick,
1958; Langfeldt, 1937; Malamud & Render,
1939; Mayer-Gross & Moore, 1944; Pascal
et al., 1953; Schofield et al., 1954; Zubin &
Windle, 1954), that studies done in this area
have produced remarkably few definitive
findings. One reason for the lack of such
findings is the failure to meet the methodological
problems involved in prognostic research,
e.g., operational definitions of both predictor
and criterion variables, utilization of an appropriate
time interval in the assessment of
outcome, control of significant variables not
employed as predictors, etc. A comprehensive
list of such methodological difficulties, accompanied
by a warning concerning the need
for their effective handling, was presented by
Malamud and Render (1939). Evidence that
this warning failed to deter the production of
a number of methodologically unsound studies
can be found in reviews by Zubin and Windle
(1954) and Huston and Pepernick (1958).
These authors, writing more than 15 years
after Malamud and Render, advance many
of the same methodological criticisms noted
earlier.

A number of studies have attempted to
conform to sound methodological requirements,
but they have lacked a theoretical
framework (Bayard & Pascal, 1954; Cole,
Swensen, & Pascal, 1954; Dunham & Meltzer,
1946; Feldman, Pascal, & Swensen, 1954;
Malamud & Render, 1939; Pascal et al., 1953;
Schofield et al., 1954; Swensen & Pascal,
1954a, 1954b). This empirical emphasis characterizes
the literature on prognosis and has
been especially evident in the search for prognostic
predictors among case history data.
Although the present authors are impressed in
principle with the merits of the actuarial
approach (Meehl, 1954), such empiricism has
in this instance given rise to a multitude of
diverse and incongruent findings. Inconsistent
and frequently contradictory findings have
been reported on the relationship of prognosis
to such variables as age (Chase & Silverman,
1941; Dunham & Meltzer, 1946; Guttmann,
Mayer-Gross, & Slater, 1939; Huston &
Pepernick, 1958; Langfeldt, 1937; Malamud
& Render, 1939; Pascal et al., 1953; Rennie,
1939; Rupp & Fletcher, 1940; Schofield et al.,
1954; Silverman, 1941; Stalker, 1939), marital
status (Chase & Silverman, 1941; Dunham
nick, 1958; Kant, 1944; Pascal et al., 1953;
Schofield et al., 1954; Stalker, 1939), and
education and intelligence (Bellak, 1948; 
Bowman & Raymond, 1929; Chase & Silver- 
man, 1941; Dewan, 1948; Dunham & Meltzer, 
1946; Fellner & Weil, 1955; Huston &
Pepernick, 1958; Jacob, 1940; Langfeldt, 1937; 
Lewis & Blanchard, 1931; Malamud & Render, 
1939; Malamud, Sands, Malamud, & Powers,
1949; Pascal et al., 1953; Rennie, 1939; Schofield
et al., 1954; Stalker, 1939). In view of the
observed ambiguity of the relation of these 
variables to prognosis, it is not surprising to 
find that many investigators follow either of 
two paths: They draw their conclusions con- 
cerning the importance of such variables by 
simply counting the observations pro and con, 
or they completely discount the significance 
of such variables as predictors of outcome in 
favor of more dynamic considerations (Blair, 
1940; French & Kasanin, 1941; Strecker & 
Willey, 1928; Sullivan, 1928). In sum, little 
in the way of reliable conclusions can be 
drawn from the extensive literature on prog- 
nosis. 

Continued piecemeal and empirical investi- 
gation of case history items offers little of 
heuristic value toward an understanding of 
prognosis. What appears to be needed is a 
theoretical framework which can meaningfully 
include such biographical items and thus pro- 
vide them with a conceptual foundation. 
In a series of studies reported earlier (Phillips, 
Kaden, & Waldman, 1959; Phillips & Rabino- 
vitch, 1958; Phillips & Zigler, 1961; Zigler & 
Phillips, 1960, 1961), the present authors and 
their colleagues have suggested the central role 
of personal and social maturity and have made 
systematic use of a developmental approach 
to psychopathology. We propose that this 
theoretical orientation can be fruitfully applied 
in the study of prognosis. 

Central to the developmental approach is 
the view that the individual progresses through 
successive stages or levels of maturity and that 
individuals differ in the final level of maturation 
attained. At each level of maturation, society 
presents a complex of tasks with which the 
individual may cope successfully or deal with
inappropriately. That is, for every maturity
level there is a normal pattern of adaptation
as well as a pathological deviation from this
pattern. Psychopathology then represents
various forms of inappropriate resolution. The
various pathologies (syndromes) may be con- 
ceptually ordered and viewed as representing 
inadequate resolutions at successive stages of 
social maturation (Phillips, 1960; Zigler & 
Phillips, 1960, 1961). 

This position negates the view that 
normality is the absence of pathology and 
suggests instead that pathology can best be 
understood within the context of normal 
maturation. Thus it places major emphasis on 
the adaptive potential of the individual. The 
major implication of this position is that 
remission from the various pathologies is repre- 
sented by the establishment of a successful 
resolution of the individual’s adaptive diffi- 
culties at a level appropriate to him rather 
than at some ideal end-state. That is, normality 
per se is not considered identical to the highest 
level of maturity. Increasing maturity, how- 
ever, should allow the individual to bring 
greater resources to bear on the mastery of 
those tasks set by society. Thus in pathological 
individuals a higher degree of maturity should 
imply a greater potential for undoing in- 
appropriate solutions to these tasks. Therefore 
maturity level should bear a definite relation 
to prognosis. 

The present investigators feel that it is pre- 
mature to attempt the construction of a com- 
prehensive index of maturity. Since a measure 
of this sort is lacking, they have employed 
social competence, defined by the variables of 
age, intelligence, education, occupation, em- 
ployment history, and marital status as an 
approximation of personal and social maturity. 
The concept of social competence differs from 
the narrower constructs of social class, skill 
level, or social adjustment. While it may sub- 
sume these narrower measures, it is most 
appropriately viewed as a broad, multidefined 
variable which reflects the developmental level 
of the individual, i.e., the level of maturation 
attained. Evidence that social competence 
is a valid indicator of the individual’s over-all 
level of maturity may be found in several 
studies (Grace, 1956; Lane, 1955; Smith & 
Phillips, 1959) that have reported a positive 
relationship between social effectiveness and 
often employed perceptual indices of develop- 
mental level. Earlier studies have confirmed 
the investigators’ position that a relationship 
exists between social competence and the 
incidence and form of mental disorders
(Phillips & Zigler, 1961; Zigler & Phillips, 
1960, 1961). It is the purpose here to test 
the proposition that prognosis is also related 
to social competence. 


STUDY I
Method 


Subjects. This study was based on the examination 
of the case histories of 251 patients admitted to Worcester
State Hospital during a 9-year period (1945-1954).
The particular case histories employed were
those of patients who had no previous admissions to a 
psychiatric hospital, were referred to the hospital psy- 
chology department for psychological appraisal, were 
diagnosed as suffering from a functional disorder, and 
were committed to the hospital following the 40-day 
observation period. Although this sample includes a 
wide variety of cases, extremely deteriorated or very
agitated patients are seldom referred for psychological
evaluation and are not adequately represented in this
sample.

Social competence score. The variables of age, intelli- 
gence, education, occupation, employment history, and 
marital status were used as social competence indices. 
Each variable was divided into three categories with 
each category conceptualized as representing a step 
along a social competence continuum. The rationale em- 
ployed for ordering the categories of each variable was 
the same as that used earlier (Zigler & Phillips, 1960). 
The categories of each variable and their order from low 
to high are presented below: 

1. Age: 24 and below; 25-44; 45 years and above. 

2. Intelligence: IQs obtained on a standard intelligence
test of 84 or less; 85-115; 116 and above.

3. Education: none or some grades including un- 
graded or special classes; finished grade school, 
some high school, or high school; some college or 
more. 

4. Occupation: The Dictionary of Occupational Titles 
(United States Government Printing Office, 1949) 
was employed to place each occupation into the 
categories of unskilled or semiskilled, skilled and 
service, or clerical and sales or professional and 
managerial.

5. Employment history: usually unemployed; sea- 
sonal, fluctuating, frequent shifts, or part-time 
employment; regularly employed. 

6. Marital status: single; separated, divorced, re- 
married, or widowed; single continuous marriage. 

The case history of each individual was examined in 
order to assign him a score on each of the six social com- 
petence indices. For any of the indices, placement in 
the lowest category resulted in being given a score of 0 
for that index. Similarly, assignment to the middle cate- 
gory resulted in a score of 1 and to the highest category, 
a score of 2. The over-all social competence score for 
each person was the mean of the scores obtained on the
individual indices. This averaging procedure was required
because data were not available for every individual
on all six variables. Thus the final social competence
score for any patient could range from 0 (the
lowest category on every variable for which information
was available) to 2 (the highest category on every
variable for which information was available). The
distribution of social competence scores of the 251 patients
was cut as close as possible to the median to form High 
and Low social competence groups. 
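
The scoring rule just described can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' procedure: the index names, the use of `None` for an index with no case history information, and the treatment of scores tied at the median (assigned to Low here) are all hypothetical choices the text does not specify.

```python
from statistics import mean, median

# Hypothetical short names for the six indices listed above.
INDICES = ("age", "intelligence", "education", "occupation",
           "employment", "marital")

def competence_score(categories):
    """Mean of the available index scores (0 = lowest, 1 = middle, 2 = highest).

    `categories` maps an index name to 0, 1, 2, or None when the case
    history gives no information for that index.
    """
    available = [categories[k] for k in INDICES
                 if categories.get(k) is not None]
    if not available:
        raise ValueError("no index could be scored for this patient")
    return mean(available)

def split_high_low(scores):
    """Cut the scores at the median; ties go to Low (an assumption)."""
    cut = median(scores)
    return ["High" if s > cut else "Low" for s in scores]
```

Averaging over only the available indices keeps every patient on the same 0 to 2 scale regardless of how many indices could be scored, which is the reason the text gives for the procedure.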

Outcome measures. The outcome measures employed 
were length of institutionalization; following discharge 
from the hospital, the patient’s readmission to any 
mental hospital in Massachusetts; and for those pa- 
tients readmitted, the time interval between discharge 
from the first admission and the second admission. 

Although these measures have the advantage of being 
operational, they can only be considered gross indi- 
cators of prognosis. The length of hospitalization meas- 
ure is especially vulnerable to contamination because of 
its extreme sensitivity to changes in the administrative 
policies of the hospital. The shortcomings of this meas- 
ure are balanced somewhat by the readmission meas- 
ures. These latter indices appear to be more sensitive 
indicators of an individual’s adaptive resources. 


Results 


The High and Low competence groups were 
initially compared on the number of patients 
in each group who were still in the hospital as 
opposed to the number who had been released.
The results of this 2 × 2 contingency table
(χ² = 4.52, p < .025) indicated that a higher
proportion of patients in the Low as compared 
to the High group had remained in the hospital. 
(One-tailed tests of significance were used 
throughout this investigation whenever tests 
were made of the general prediction that an 
individual’s premorbid social competence was 
positively related to outcome in psychiatric 
disorder.)

A second evaluation of the relationship 
between social competence score and length of 
institutionalization was computed. The dis- 
tribution of scores on this latter measure was 
divided at the median, and a 2 × 2 contingency
table was set up. The findings (χ² = 11.26,
p < .001) indicated that patients in the High 
group remained in the hospital for a shorter 
length of time than did patients in the Low 
group. In order to assess how much of this 
difference was due to the time scores of indi- 
viduals who remained in the hospital, this 
analysis was redone, excluding such patients 
from consideration. This analysis revealed 
that of the patients who had been released 
from the hospital, those in the High group had 
been released after a shorter time than those 
in the Low group (χ² = 7.53, p < .005).
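
For readers who wish to see how a statistic of this kind is obtained, the following is a minimal sketch of the Pearson chi-square for a 2 × 2 table. The cell counts behind the reported values are not given in the text, so none are reproduced here. The sketch assumes no continuity correction and assumes that a one-tailed probability is obtained by halving the chi-square tail probability; the source states neither, and the function names are illustrative.

```python
from math import erfc, sqrt

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
        [[a, b],
         [c, d]].
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a margin is empty; chi-square is undefined")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

def one_tailed_p(chi2):
    """Half the upper-tail probability of a 1-df chi-square value.

    For 1 df, P(X > chi2) equals erfc(sqrt(chi2 / 2)); halving it for a
    directional prediction is an assumption about the reporting convention.
    """
    return 0.5 * erfc(sqrt(chi2 / 2))
```

Under these assumptions, `one_tailed_p(4.52)` falls below .025, which agrees with the level reported for the first comparison.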

An analysis was then performed on the 
number of patients in each of the groups who
had been rehospitalized versus the number who 
had remained out. This analysis (χ² = 4.00,
p < .025) revealed that a larger proportion of
the Low as opposed to the High group had 
been rehospitalized. The final analysis was 
conducted on the relationship of social compe- 
tence to the time interval between discharge 
and readmission. The distribution of these 
time intervals was split at the median, and a 
social competence group × time interval contingency
table was set up. The findings associated
with this table (χ² = 1.52, p < .20)
gave no evidence of a significant relationship 
between premorbid social competence and the 
length of time between release and readmission 
for patients readmitted to a hospital. 
STUDY II

The findings reported above indicate that 
patients who have manifested a relatively good 
level of social competence are more likely to 
have been released from the hospital, spend a 
shorter period of hospitalization, and are less 
likely to be rehospitalized than patients whose 
premorbid history was poor. However, the 
nature of the relationship between social 
competence and outcome is still not clear. As 
reported earlier (Zigler & Phillips, 1961), an 
individual’s social competence is related to 
the diagnosis he receives. Therefore, the dif- 
ferences in outcome for the two competence 
groups may be due to differences in their 
diagnostic composition rather than being 
directly related to social competence. 

Furthermore, patients high on the particular 
indices employed to measure social compe- 
tence, e.g., intelligence, may receive more 
intensive psychiatric treatment. It could be 
argued that the better outcome in the High 
competence group was the result of such 
superior treatment. A final question can be 
raised concerning the purity of each of the two 
groups on the competence variable. An earlier 
study (Zigler & Phillips, 1961) revealed that 
all hospitalized patients tend to be drawn 
from the less socially competent segment of 
the population. The averaging procedure em- 
ployed in Study I and the nature of the dis- 
tribution of competence scores were such that 
a patient high on but one or two competence 
indices could be included in the High group. 

These considerations, in conjunction with 
the finding of Study I, suggested that a more
refined study was in order. Employing the out- 
come measures used in Study I, groups that 
were clearly high or low on the competence 
variable were compared, special attention being 
given to the diagnostic and treatment factors. 


Method 


High and low social competence groups. In the present 
study, any patient who fell above the median of the 
distribution on at least four of the six social competence 
indices was placed in the High social competence group. 
Any patient who fell below the median on at least four 
of the six indices was placed in the Low social com-
petence group. This procedure resulted in 30 and 36
patients being assigned to the High and Low social
competence groups, respectively. The remaining pa-
tients who did not meet one of these two criteria were
removed from further consideration. The diagnostic
composition of the two groups appeared to be fairly
comparable. The High group was composed of 16 indi-
viduals diagnosed as schizophrenic, 7 manic depressives,
and 7 psychoneurotics. The Low group was composed of
25 schizophrenics, 4 manic depressives, 3 psycho-
neurotics, and 4 individuals diagnosed as suffering from
a character disorder. An examination of the case his- 
tories of the two groups revealed no systematic differ- 
ences between the groups on psychiatric treatment re- 
ceived. In the High group, 15 individuals received some 
form of convulsive therapy, 5 received psychotherapy, 7 
received a combination of convulsive therapy and 
psychotherapy, with the remaining 3 patients receiving 
routine hospital care. In the Low group, 17 individuals 
received some form of convulsive therapy, 7 received 
psychotherapy, 8 received a combination of convulsive 
therapy and psychotherapy, with the remaining 4 pa- 
tients receiving routine hospital care. 
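
The grouping rule described above (above the median on at least four of six indices for High, below the median on at least four for Low, otherwise dropped) can be sketched as follows. This is only an illustration under stated assumptions: the function name `assign_groups`, the patient identifiers, and the numeric scores are hypothetical, and each index is assumed to be coded so that a larger value means greater competence.

```python
from statistics import median


def assign_groups(scores, min_indices=4):
    """Assign each patient to "High", "Low", or None (excluded).

    scores: dict mapping a patient id to a list of index scores,
    one score per social competence index (hypothetical coding:
    higher value = more competent).
    """
    n_indices = len(next(iter(scores.values())))
    # Median of each index across all patients in the sample.
    medians = [median(s[i] for s in scores.values()) for i in range(n_indices)]
    groups = {}
    for pid, s in scores.items():
        above = sum(v > m for v, m in zip(s, medians))
        below = sum(v < m for v, m in zip(s, medians))
        if above >= min_indices:
            groups[pid] = "High"
        elif below >= min_indices:
            groups[pid] = "Low"
        else:
            groups[pid] = None  # meets neither criterion; removed
    return groups


# Hypothetical scores on six indices for five patients.
example_scores = {
    "p1": [3, 3, 3, 3, 3, 3],
    "p2": [1, 1, 1, 1, 1, 1],
    "p3": [2, 2, 2, 2, 2, 2],
    "p4": [3, 3, 3, 1, 1, 1],
    "p5": [1, 1, 1, 3, 3, 2],
}
groups = assign_groups(example_scores)
```

Because four above-median and four below-median indices cannot both occur among six, no patient can satisfy both criteria, so the order of the two checks does not matter.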


Results 


The two groups were initially compared on 
the time interval which had elapsed between 
admission to the hospital and the carrying out 
of the present study. The mean time intervals 
for the Low and High groups were 9.6 and 
10.5 years, respectively. This difference was 
not a statistically significant one (t < 1.00).

The first outcome measure to be analyzed 
was length of institutionalization. Seven in- 
dividuals in the Low and one individual in the 
High group had not been discharged from the 
hospital since their original admission. The
length of institutionalization ascribed to each 
of these patients was that time interval be- 
tween their admission and the initiation of 
the present study. The mean lengths of insti- 
tutionalization for individuals in the Low and 
High groups were 47.6 and 22.7 months, 
respectively. This difference is a highly sig-
nificant one (t = 3.15, p < .005). In order to
assess whether this difference was due only to 
the extreme scores of individuals who have 
remained in the hospital, a second analysis 
was done, excluding such subjects from con- 
sideration. Without the seven subjects in the 
Low and the one subject in the High group, 
the mean lengths of institutionalization for 
subjects in the Low and High groups were 
30.9 and 19.6 months, respectively. Although a 
conservative estimate of the difference be- 
tween the two groups, this difference remains 
a highly significant one (t = 2.76, p < .005).
Since the two groups were not perfectly 
matched on diagnosis, an analysis was done 
on length of institutionalization of only the 
schizophrenics in the High and Low groups. 
The mean lengths of institutionalization for the
schizophrenics in the Low and High groups
were 46.4 and 26.2 months, respectively. This
difference is also a significant one (t = 1.85,
p < .05).

The next outcome measure to be analyzed 
was return to the hospital. Of the 29 patients 
discharged in the High group, 4 were read- 
mitted to a psychiatric setting. Of the 29 
patients discharged in the Low group, 11 were 
readmitted. This difference is a significant one 
(χ² = 4.40, p < .02). Fisher’s exact method was
employed to see whether this difference ob- 
tained for the schizophrenics alone. Of the 20 
discharged schizophrenics in the Low group, 
6 were rehospitalized, whereas only 1 of the 15 
discharged schizophrenics in the High group 
was rehospitalized. This difference is also a 
significant one (p < .05). 
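
As a rough check on the readmission comparison, the chi-square statistic for a fourfold table can be computed directly from the counts given above (4 of 29 High and 11 of 29 Low patients readmitted). The sketch below assumes the uncorrected Pearson formula; the text does not say whether a continuity correction was applied, and the function name `chi_square_2x2` is only illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: High, Low. Columns: readmitted, not readmitted.
readmission = chi_square_2x2(4, 29 - 4, 11, 29 - 11)
print(round(readmission, 2))
```

Under that assumption the statistic comes to about 4.41, in line with the reported value of 4.40.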

In regard to the time interval between dis- 
charge and readmission, no significant differ- 
ence was found between the Low (M = 21.6 
months) and High (M = 22.8 months) indi-
viduals (t < 1.00). A final analysis was done
which combined both the length of institu- 
tionalization and the readmission measures. 
Of the 30 patients in the High group, 5 were 
either still in or had returned to a hospital. Of 
the 36 patients in the Low group, 18 were 
either still in or had returned to a hospital. 
The difference is significant (χ² = 8.14, p <
.005). It may be concluded that individuals
in the Low group have both a longer period of
institutionalization and a greater likelihood
of rehospitalization than individuals in the
High group. Thus the findings of the second
study confirm those reported in the first. 


DISCUSSION 


Findings of the present study support the 
hypothesis that a positive relationship exists 
between premorbid social competence and 
prognosis. These findings within the schizo- 
phrenic portion of the sample employed are 
in agreement with earlier reported findings 
(Farina & Webb, 1956; Henderson & Gillespie, 
1936; Hunt & Appel, 1936; Mauz, 1930;
Phillips, 1953; Stalker, 1939; Strauss, 1931) 
that a relationship obtains between premorbid 
adequacy and prognosis. It should be noted 
that in the study reported here, groups that 
were almost perfectly matched in regard to 
type of psychiatric treatment differed sig- 
nificantly on the outcome measures. It would 
thus appear that premorbid social competence 
is a more significant variable in determining 
outcome in a psychiatric disorder than is 
treatment received. 

Previous studies (Grace, 1956; Lane, 1955; 
Smith & Phillips, 1959) have related social 
competence to performance on Rorschach 
measures of perceptual development. Based 
on the use of these indices, Levine (1959) re- 
ported a relationship between length of hos- 
pitalization and developmental Rorschach 
indices. The results of the present study, in 
conjunction with the findings of these earlier 
studies, support the view that a theoretically 
oriented, global approach to the problem of 
prognosis is superior to an atomistic approach 
employing isolated variables. 

Although there is clear evidence that a rela-
tionship exists between maturity and outcome,
further work must be done in order to isolate 
those specific components of higher develop- 
ment that eventuate in a good prognosis. There 
is a certain face validity in the view that the 
greater the psychological resources of the in- 
dividual the better is his prognosis. However, 
it is the present investigators’ belief that a 
specific delineation of those factors related to 
a better prognosis is needed. 

Several hypotheses suggest themselves. The 
first of these is that a more mature person can 
more adequately utilize the opportunities for 
change afforded by a psychotherapeutic rela- 
tionship. Such a view would appear to underlie 
the selection of patients for psychoanalytic 
and, to a lesser extent, other forms of psycho-
therapy. Furthermore, it is possible also that
an interaction effect exists between the effec- 
tiveness of the organic therapies and personal 
maturity. 

A second hypothesis is suggested by the view
of Phillips and Rabinovitch (1958) that the 
more mature individual is more likely to have 
incorporated the values of society and displays 
considerable guilt and anxiety when he does 
not successfully meet these values. Therefore, 
it may be hypothesized that a maladaptive, 
pathological solution to life’s problems is much 
less acceptable to the developmentally high 
than to the developmentally low individual. 
This unacceptability of a pathological solution 
should result in an improved prognosis. 

A final hypothesis may be related to the 
nature of the problems which precipitate the 
mental disorder. As earlier studies have re- 
ported, in individuals who have displayed 
good premorbid social adequacy, there is a 
tendency for a specific traumatic event to 
precipitate the disorder (Becker, 1956; 
Garmezy & Rodnick, 1959; Kantor, Wallner, 
& Winder, 1953; Phillips, 1953; Weiner, 1958; 
Zimet & Fine, 1959). Antithetically, in indi- 
viduals with poor premorbid social adequacy, 
there is a tendency for the disorder not to be 
preceded by such events. Instead, these indi- 
viduals show an insidious history of in- 
appropriate socialization. Although higher 
levels of maturation correspond to specific 
problems unique to those levels, there are 
certain basic problems of socialization which 
must be mastered by all individuals. Thus it 
may be hypothesized that the developmentally 
low individual does not have the resources to 
master successfully these basic problems, and 
this lack militates against a good prognosis.
On the other hand, the developmentally high 
individual has mastered these basic problems 
and must only deal with later problems, corre- 
sponding to his higher level of maturity. Since 
these real problems of adulthood are more 
circumscribed, there is a greater likelihood that 
they can eventually be mastered, thus reversing
the pathological process. 

The findings of the present study offer 
further evidence that the developmental 
approach has considerable value for our under- 
standing of psychopathology. In conjunction 
with the results of earlier studies (Phillips & 
Zigler, 1961; Zigler & Phillips, 1960, 1961),
it may be concluded that level of maturity as 
defined by social competence indices is related 
to the incidence of mental disorder, symptoms 
manifested, diagnosis received, and prognosis. 

This body of work is in essential agreement 
with the position that mental disorders are 
continuing processes in which the premorbid, 
initial, middle, and ultimate stages are inter- 
related (e.g., Zubin & Windle, 1954). The 
developmental position advanced in this and 
earlier studies appears capable of providing a 
conceptual framework within which such inter- 
relationships become meaningful. 

This position suggests that psychopathology 
can best be conceptualized as a unitary phe- 
nomenon within which the various types of 
mental disorders are also meaningfully related, 
rather than as a collection of discrete entities. 
Clearest evidence for the validity of this view 
would be the isolation and utilization of dimen- 
sions whose applicability transcends individual 
disorders, being applicable instead to all of 
psychopathology. One such dimension is that 
of social competence as employed in this study. 
The applicability of this dimension to all of 
psychopathology is demonstrated by the find- 
ing of the present study that the relationship 
between social competence and prognosis is 
found for both schizophrenic and nonschizo- 
phrenic groups. 

This complex of good premorbid adjustment 
and good prognosis has, in combination with 
type of onset, been employed to differentiate 
the reactive from the process type of schizo- 
phrenia. This process-reactive distinction has 
in turn been related to an organic-functional 
dichotomy in etiology (Becker, 1956; Bellak, 
1948; Brackbill, 1956; Finkelstein, 1953; 
Kantor et al., 1953). The finding that similar 
complexes of premorbid social competence 
and prognosis may be found within diagnostic
groups whose etiology has conventionally been 
considered nonorganic, e.g., psychoneurotics 
and character disorders, calls into question 
the view that such complexes mirror the 
presence or absence of organic involvement. 
This view is further called into question by 
the finding that type of onset, one variable 
employed in making the process-reactive dis- 
tinction, is also related to the premorbid social 
adequacy (Phillips, 1953). The findings re-
ported here thus suggest that the process-
reactive distinction is not unique to schizo-
phrenia and may be entirely reducible to the
social competence dimension. 


SUMMARY 


Two studies testing the hypothesis that a 
relationship exists between a patient’s pre- 
morbid social competence and the outcome of 
his disorder were reported. The case histories of 
251 patients were examined in order to estab- 
lish High and Low social competence groups. 
The variables of age, intelligence, education, 
occupation, employment history, and marital 
status were employed as indices of social 
competence. It was found that patients in the 
Low group have both a longer period of institu- 
tionalization and a greater likelihood of re- 
hospitalization than patients in the High 
group. These findings were discussed within the 
framework of the developmental approach to 
psychopathology. 


REFERENCES 


Bayard, J., & Pascal, G. Studies of prognostic criteria
in the case records of hospitalized mental patients.
J. consult. Psychol., 1954, 18, 122-126.

Becker, W. A genetic approach to the interpretation
and evaluation of the process-reactive distinction
in schizophrenia. J. abnorm. soc. Psychol., 1956, 53,
229-236.

Bellak, L. Dementia praecox. New York: Grune &
Stratton, 1948.

Blair, D. Prognosis in schizophrenia. J. ment. Sci.,
1940, 86, 378-477.

Bowman, K., & Raymond, A. Physical findings in
schizophrenia. Amer. J. Psychiat., 1929, 8, 901-913.

Brackbill, G. Studies of brain dysfunction in schizo-
phrenia. Psychol. Bull., 1956, 68, 210-226.

Chase, L. S., & Silverman, S. Prognostic criteria in
schizophrenia. Amer. J. Psychiat., 1941, 98, 360-
368.

Cole, M., Swensen, C., & Pascal, G. Prognostic sig-
nificance of precipitating stress in mental illness.
J. consult. Psychol., 1954, 18, 171-175.

Dewan, J. Intelligence and emotional stability. Amer.
J. Psychiat., 1948, 104, 548-554.

Dunham, H., & Meltzer, B. Predicting length of hos-
pitalization of mental patients. Amer. J. Sociol.,
1946, 62, 123-131.

Farina, A., & Webb, W. Premorbid adjustment and
subsequent discharge. J. nerv. ment. Dis., 1956,
124, 612-613.

Feldman, D., Pascal, G., & Swensen, C. Direction of
aggression as a prognostic variable in mental ill-
ness. J. consult. Psychol., 1954, 18, 167-170.

Fellner, C., & Wet, P. Low normal intelligence and
schizophrenia. Amer. J. Psychiat., 1955, 112, 349-
353.

Finkelstein, Z. A study in schizophreniform psy-
choses. Acta psychiat. neurol. Scand., Kbh., 1953,
28, 45.

French, T., & Kasanin, J. A psychodynamic study of
the recovery of two schizophrenic cases. Psycho-
anal. Quart., 1941, 10, 21-22.

Garmezy, N., & Rodnick, E. Premorbid adjustment
and performance in schizophrenia. J. nerv. ment.
Dis., 1959, 129, 450-466.

Grace, N. A developmental comparison of word usage
with structural aspects of perception and social
adjustment. Unpublished doctoral dissertation,
Duke University, 1956.

Guttmann, E., Mayer-Gross, W., & Slater, E.
Short distance prognosis in schizophrenia. J.
neurol. Psychiat., 1939, 2, 25-34.

Henderson, D., & Gillespie, R. A textbook of psy-
chiatry. London: Oxford Univer. Press, 1936.

Hunt, R., & Appel, K. Prognosis in the psychoses
lying midway between schizophrenia and manic-
depressive psychoses. Amer. J. Psychiat., 1936,
93, 313-339.

Huston, P., & Pepernik, M. Prognosis in schizo-
phrenia. In L. Bellak (Ed.), Schizophrenia. New
York: Logos, 1958. Pp. 531-546.

Jaco, J. Prediction of outcome-on-furlough of demen-
tia praecox patients. Genet. psychol. Monogr., 1940,
22, 425-453.

Kant, O. Evaluation of prognostic criteria in schizo-
phrenia. J. nerv. ment. Dis., 1944, 100, 598-605.

Kantor, R., Wallner, J., & Winder, C. Process and
reactive schizophrenia. J. consult. Psychol., 1953,
17, 157-162.

Lane, J. Social effectiveness and developmental level.
J. Pers., 1955, 23, 274-284.

Langfeldt, G. The prognosis in schizophrenia and the
factors influencing the course of the disease. London:
Humphrey Milford, 1937.

Levine, D. Rorschach genetic level and mental dis-
order. J. proj. Tech., 1959, 23, 436-439.

Lewis, N., & Blanchard, E. Clinical findings in “re-
covered” cases of schizophrenia. Amer. J. Psychiat.,
1931, 11, 481-492.

Malamud, W., & Render, N. Course of prognosis in
schizophrenia. Amer. J. Psychiat., 1939, 95, 1039-
1057.

Malamud, W., Sands, S., Malamud, I., & Powers, P.
The involutional psychoses: A socio-psychiatric
follow-up study. Amer. J. Psychiat., 1949, 106,
567-572.

Mauz, F. Die Prognostik der endogenen Psychosen. Leipzig:
George Thieme, 1930.

Mayer-Gross, W., & Moore, N. P. Schizophrenia.
Recent progress in psychiatry. J. ment. Sci., 194,
90, 231-251.

Meehl, P. Clinical versus actuarial prediction. Min-
neapolis: Univer. Minnesota Press, 1954.

Pascal, G., Swensen, C., Feldman, D., Cole, M., &
Bayard, J. Prognostic criteria in the case histories
of hospitalized mental patients. J. consult. Psychol.,
1953, 17, 163-171.

Phillips, L. Personality factors and prognosis in
schizophrenia. Unpublished doctoral dissertation,
University of Chicago, 1949.










Phillips, L. Case history data and prognosis in schizo-
phrenia. J. nerv. ment. Dis., 1953, 117, 515-525.

Phillips, L. Studies in social competence. Paper read
at Eastern Psychological Association, New York,
April 1960.

Phillips, L., Kaden, S., & Waldman, M. Rorschach
indices of developmental level. J. genet. Psychol.,
1959, 94, 267-285.

Phillips, L., & Rabinovitch, M. Social role and pat-
terns of symptomatic behaviors. J. abnorm. soc.
Psychol., 1958, 57, 181-186.

Phillips, L., & Zigler, E. Social competence: The
action-thought parameter and vicariousness in
normal and pathological behaviors. J. abnorm. soc.
Psychol., 1961, 63, 137-146.

Rennie, T. Follow-up study of 500 patients with
schizophrenia admitted to the hospital from 1913
to 1923. Arch. Neurol. Psychiat., 1939, 42, 877-891.

Rupp, C., & Fletcher, E. Five to ten years follow-up
study of 641 schizophrenic cases. Amer. J. Psy-
chiat., 1940, 96, 877-888.

Schofield, W., Hathaway, S., Hastings, D., & Bell,
D. Prognostic factors in schizophrenia. J. consult.
Psychol., 1954, 18, 155-166.

Silverman, D. Prognosis in schizophrenia: A study of
271 cases. Psychiat. Quart., 1941, 15, 477-493.

Smith, L., & Phillips, L. Social effectiveness and de-
velopmental level in adolescence. J. Pers., 1959, 27,
239-249.

Stalker, H. The prognosis in schizophrenia. J. ment.
Sci., 1939, 85, 1227-1240.

Strauss, E. Some principles underlying prognosis in
schizophrenia. Proc. Roy. Soc. Med., Sect. Psy-
chiat., 1931, 24, 27-32.

Strecker, E., & Willey, G. Prognosis in schizo-
phrenia. Proc. Assoc. Res. Nerv. Ment. Dis., 1928,
5, 403-431.

Sullivan, H. Tentative criteria of malignancy in
schizophrenia. Amer. J. Psychiat., 1928, 74, 759-
787.

Swensen, C., & Pascal, G. Duration of illness as a
prognostic indicator in mental illness. J. consult.
Psychol., 1954, 18, 363-365. (a)

Swensen, C., & Pascal, G. Prognostic significance of
type of onset of mental illness. J. consult. Psychol.,
1954, 18, 127-130. (b)

United States Government Printing Office. The
dictionary of occupational titles. Washington:
USGPO, 1949.

Weiner, H. Diagnosis and symptomatology. In L.
Bellak (Ed.), Schizophrenia. New York: Logos,
1958. Pp. 107-173.

Zigler, E., & Phillips, L. Social effectiveness and
symptomatic behaviors. J. abnorm. soc. Psychol.,
1960, 61, 231-238.

Zigler, E., & Phillips, L. Case history data and psy-
chiatric diagnosis. J. consult. Psychol., 1961, 26,
458.

Zimet, C., & Fine, H. Perceptual differentiation and
two dimensions of schizophrenia. J. nerv. ment. Dis.,
1959, 129, 435-441.

Zubin, J., & Windle, C. Psychological prognosis of
outcome in the mental disorders. J. abnorm. soc.
Psychol., 1954, 49, 272-281.


(Received August 18, 1960) 








Journal of Abnormal and Social Psychology 
1961, Vol. 63, No. 2, 272-282 


PATTERNS OF CARDIAC AROUSAL DURING COMPLEX 
MENTAL ACTIVITY 


SIDNEY J. BLATT¹
Michael Reese Hospital 


In recent years, trends in such divergent
fields as motivational theory, experi-
mental psychology, and psychoanalytic
ego psychology have reflected converging 
emphasis on the organism’s capacity and 
desire to interact effectively with its environ- 
ment. This adaptive motivational force was 
noticed early in animal research in the animal’s 
exploration, search, and observation of its 
surroundings. The adaptive aspect of this 
behavior has been demonstrated by the re- 
search on latent learning and incidental learn- 
ing. Extensive recent research on exploration, 
curiosity, activity, and the need for novelty in 
animals has been well summarized (Berlyne, 
1960; Hebb, 1955; White, 1959) and indicates 
the basic need of the organism to explore a 
novel situation or to create a stimulus change 
in a repetitive and bland environment (Zim- 
bardo & Miller, 1958). These findings, as well 
as the research on the severe effects of sensory 
deprivation (Hebb, 1958; Lilly, 1956) have 
compelled many theorists to formulate the 
organism’s need for effective interaction with 
its environment as a motivational force that is 
relatively independent of any primary drive 
or tissue need (White, 1959; Woodworth, 
1958). 

¹ The author wishes to express his appreciation to
Roy R. Grinker, Sheldon J. Korchin, Sara K. Polka,
and Morris I. Stein for their comments during this
study and for making available funds and equipment.
The author also expresses appreciation to Helen Heath
for aid with the statistical analysis, to Paul Cekan for
assistance in the polygraph instrumentation, and to
Charles Greenberg for aid in scoring the data.

The author’s current address is the Department of
Psychology, Yale University, New Haven, Connecticut.

² Adaptation as used in this paper is more than just
a passive conformity, but rather includes an active,
creative, constructive coping and interaction with the
environment.

A similar trend and emphasis has also been
apparent in the development of psycho-
analytic theory (Gill, 1959), where the ego
theorists all have highlighted the autonomous
ego functions: those aspects of the personality
structure concerned with adaptation,² which
are functionally independent of instinctual
drives and derive in part from an independent
source. Hendrick (1942) also wrote of an 
“instinct to master,” an “inborn drive to do
and to learn how to do,” a “pleasure in ex- 
ercising a function successfully regardless of 
its sensual value.” Fenichel (1945) wrote of 
mastery as a “general aim of every organism 
but not a specific instinct,” “a pleasure of 
enjoying one’s abilities” that derives from the 
pleasure of functioning without anxiety. 
Actually this motivational force for exploration 
and adaptive functioning was discussed by 
Bühler (1930) as the Funktionlust and by
Freud (1927, 1949) when he alluded to the 
fact that ego functions are supplied with their 
own energy independent of instincts, and that 
there is pleasure in their exercise. Intrinsic 
satisfaction in functioning and the desire for 
stimulation and growth are seen in many 
personality theories (Goldstein, 1939; Maslow, 
1954; Rogers, 1951) as the sine qua non of 
mental health and psychological maturity. It 
is these factors that are impaired in pathology, 
distorted by conflict, and thwarted by de- 
fensive maneuvers. For some theorists, such as 
Woodworth (1958) and White (1959), coping 
or dealing with the environment is viewed as a 
fundamental element in motivation. 
Cognitive processes are an essential com- 
ponent of adaptive functioning. Curiosity, 
exploration, and the need for novelty all 
imply a need for intellectual stimulation. 
Playful exploration, the desire to effect a 
stimulus change, the enjoyment of work and of 
novelty all stress intrinsic satisfaction in 
cognitive functioning. As Hebb (1955) has 
insistently pointed out, we “underestimate the 
human need of intellectual activity” and the 
degree to which man’s activity is spent in 
raising the level of stimulation and excitement. 
Hebb (1955) has conceptualized excitement 
and exploration in physiological terms and 
views it as serving an arousal or vigilance 
function that establishes a level of cortical 
excitation without which learning could not 
occur. Arousal serves as a drive, as an ener-
gizer, and efficient learning or functioning is
only possible when the level of arousal is high. 
In extreme situations, however, such as those 
provoking paralysis from terror or fright, 
excessive arousal interferes with functioning. 
Such an “inverted U shaped” relationship 
between arousal (as measured by autonomic 
variables) and level of performance has been 
frequently demonstrated (Duffy, 1951, 1957; 
Freeman, 1940; Malmo, 1957). There is an 
optimal level of arousal for efficient perform- 
ance, and levels of arousal either below or (in 
extreme situations) above this optimal level 
interfere with effective functioning. The 
inverted U shaped curve, however, has been 
demonstrated mainly in the relationship be- 
tween arousal and functioning in relatively 
simple tasks such as reaction time, rotary 
pursuit, and mirror tracing. Scant data are 
available about the relationship between 
arousal and efficiency of thought in complex 
problems. From the conceptualization of the 
arousal continuum, similar relationships should 
exist between autonomic arousal and complex 
problem solving. The purpose of the present 
study is to explore the relationships between a 
measure of autonomic arousal (cardiac rate) 
and the efficiency of complex mental activity. 

Duffy (1951, 1957) proposes two major 
dimensions of arousal: intensity and direction. 
Direction, expressed in dynamic character- 
istics of the behavior that occur concomitant 
with arousal, indicates the degree to which the 
arousal and behavior are goal directed. Thus 
not only should autonomic arousal be high in 
efficient functioning, but the arousal reactions
should also have direction and should occur in 
response to important points in the behavior or 
thought process. 

In prior research (Blatt, 1958; Blatt & 
Stein, 1959) a model was developed of the 
process by which subjects solve a complex 
logical problem. In this model of the problem 
solving process two crucial points were identi-
fied: the point at which the subject has available, im-
plicitly at least, the necessary and sufficient
information for solution, and the point at which the sub-
ject’s predominant activity shifts from ques-
tions of analysis about one-to-one relationships
to the more complex questions of synthesis
that attempt to organize and integrate in-
formation to achieve solution (analysis-
synthesis shift). These two points, which
subjects are not able to report, delineate three
phases of the problem solving process. The 
initial phase extends from the beginning of the 
problem to the point at which the subject has 
available implicitly the necessary and sufficient 
information for solution. This is followed by 
the lag phase, in which the subject gathers 
additional information prior to shifting from 
analysis to synthesis. In the synthesis phase 
the subject’s behavior reflects primary concern 
with coordinating the information he has 
obtained. 

The body of research and theory reviewed 
at the outset leads to the expectation that 
efficiency in complex mental activity should 
be characterized by heightened arousal, which 
should occur, in part, at important points in 
the thought process. The following hypotheses 
were the focus of the present study: 

1. Efficient problem solvers have a higher 
level of cardiac rate and a greater variability 
of cardiac rate than inefficient ones during 
complex mental activity. 

2. Among efficient problem solvers, eleva- 
tions in the cardiac rate occur at those points 
in the thought process at which necessary and 
sufficient information for solution has become 
available, at which the predominant activity 
has changed from analysis to synthesis, and at 
solution. 


METHOD 


Concurrent recordings of cardiac rate were obtained 
as subjects attempted to solve problems on the John- 
Rimoldi Problem Solving Apparatus (PSI). Since a 
more detailed description of the PSI apparatus is avail- 
able (Blatt & Stein, 1959; John, 1957), only a brief 
description of the apparatus is presented here. A 
diagrammatic representation of the demonstration 
problem and its solution are presented in Figure 1. 

On the apparatus the subject is presented with a 
panel containing a circular array of nine lights, plus a 
center light. Next to each of the nine outer lights there 
is a button, which, when pressed, lights up its cor- 
responding light. Some of the lights are interrelated, so 
that when Button A is pressed, Light A comes on, 
followed in the next time cycle (3 seconds later) by
Light B, to which it is related. The existence of a
relationship between the lights is indicated by arrows 
on a removable disc, and there are different discs for 
each problem. An arrow indicates one of three types of 
relationships: 

1. A direct one-to-one effect, such that the
activation of A causes B to light in the next time
interval.

2. A facilitatory or combiner effect, such that A
plus another light, X, which also has an arrow going
to B, can light B if, and only if, A and X are lit
simultaneously.

3. A blocking effect, such that A prevents X
(which has an arrow to B) from lighting B, i.e., X
can light B only if A has not been activated.

The subject is instructed about the types of relationships
that exist on the apparatus, but he is left to discover
or infer the specific relationships within each problem.

Fig. 1. Diagrammatic representation of the disc and the solution sequence of the demonstration problem.
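
The three arrow types described above can be pictured with a small simulation of one time cycle. This is a simplified sketch and not a description of the actual apparatus wiring: the `Arrow` class, the function `next_lights`, and the example disc are hypothetical, and a blocking arrow is assumed to suppress every other arrow into the same target light.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arrow:
    source: str
    target: str
    kind: str  # "direct", "combiner", or "blocker"


def next_lights(lit, arrows):
    """Return the set of lights that come on in the next time cycle."""
    result = set()
    for target in {a.target for a in arrows}:
        incoming = [a for a in arrows if a.target == target]
        # Blocking effect: an active blocker prevents the target from lighting.
        if any(a.kind == "blocker" and a.source in lit for a in incoming):
            continue
        # Direct effect: any one lit source is enough.
        direct_on = any(a.kind == "direct" and a.source in lit for a in incoming)
        # Combiner effect: all combiner sources must be lit simultaneously.
        combiners = [a.source for a in incoming if a.kind == "combiner"]
        combiner_on = bool(combiners) and all(s in lit for s in combiners)
        if direct_on or combiner_on:
            result.add(target)
    return result


# A hypothetical disc with one relationship of each type.
disc = [
    Arrow("A", "B", "direct"),
    Arrow("C", "D", "combiner"),
    Arrow("E", "D", "combiner"),
    Arrow("F", "G", "direct"),
    Arrow("H", "G", "blocker"),
]
```

With this disc, pressing A alone lights B on the next cycle, C lights D only together with E, and F lights G unless H is also active.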

In each problem the subject’s task is to discover the 
one correct sequence of the three buttons at the bottom 
of the circle that will light the center light, which has no 
activating button. The subject may use any of the 
buttons that he wishes to discover relationships, but 
may use only the three buttons at the bottom in the 
final solution. 

Thus on the apparatus the subject must press 
buttons to gather information about the logical rela- 
tionships within the problem and must also press but- 
tons when testing his various attempts to integrate this 
information. All button presses are automatically re-
corded, creating a complete record of the problem solv-
ing performance. The subject’s problem solving process
can be reconstructed sequentially, step by step, and 
each of the subject’s responses identified as necessary 
for solution or not. Each sequence of interacting button 
presses is regarded as putting a question to the ap- 
paratus and the number of unnecessary questions the 
subject asks while solving the problem measures his 
problem solving efficiency. Also in the sequential 
analysis of the responses, the crucial points of nec- 
essary and sufficient information and analysis-syn- 
thesis shift can be identified for each subject. The 
analysis-synthesis shift point is the point that best 
meets the following two conditions: most of the 
analytic questions have already been asked, and most 
of the synthetic questions still remain to be asked. 
Analytic questions are defined as those in which one 
side of the question is unity and the question is asked 
to find the constituent parts of that unity (Duncker, 
1945): there is one cause and one effect, or one cause
and multiple effects, or multiple causes with a single
effect. Synthetic questions are those in which there are
multiple causes and multiple effects, where the question
involves an attempt to integrate information. The
analysis-synthesis shift point is located where the
difference between the percentage of analytic questions
asked and the percentage of synthetic questions asked
is maximum (a/A - s/S = maximum). (a and s are,
respectively, the number of preceding analytic and
synthetic questions; A and S are the corresponding
totals, so that each ratio is a percentage.) This maximum
value also indicates the degree to which the problem
solving process was organized into these two phases.
For example, if all the analytic questions had already
occurred before a specific point (a/A = 100) and all the
synthetic questions were still to be asked (s/S = 0), then
the separation of the phases would be perfect, or 100. A
less clear separation between the two phases would be
correspondingly indicated by smaller values.
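The location of the shift point can be sketched in a few lines. This is an illustrative reading of the rule, not the scoring procedure actually used: the coding of each question as "A" or "S", the function name `shift_point`, and the convention that ties go to the earliest point are all assumptions.

```python
from fractions import Fraction
from typing import Sequence, Tuple


def shift_point(questions: Sequence[str]) -> Tuple[int, Fraction]:
    """Locate the analysis-synthesis shift point.

    `questions` is the ordered record of questions, each coded "A"
    (analytic) or "S" (synthetic); this coding is an assumption.
    Returns (k, separation): the shift lies after the k-th question,
    and separation is the maximum of a/A - s/S, in percent.
    """
    total_a = sum(1 for q in questions if q == "A")
    total_s = sum(1 for q in questions if q == "S")
    if total_a == 0 or total_s == 0:
        raise ValueError("need at least one analytic and one synthetic question")

    best_k, best_value = 0, Fraction(0)
    a = s = 0
    for k, q in enumerate(questions, start=1):
        if q == "A":
            a += 1
        elif q == "S":
            s += 1
        value = 100 * (Fraction(a, total_a) - Fraction(s, total_s))
        if value > best_value:  # ties keep the earliest point
            best_k, best_value = k, value
    return best_k, best_value
```

Exact fractions are used so that two candidate points with the same separation are not ordered by floating-point noise.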

Cardiac rate was recorded by chest leads throughout 
the entire experimental procedure of instructions, 
practice problem (presented to the subjects as first 
experimental problem), experimental problem, and four 
interposed rest periods (10 minutes each). Cardiac rate 
was measured for each 30-second interval during the 
entire experiment, the measure of heart rate being 
prorated to 440 beat at each end of these intervals. The 
mean and standard deviation of these values were 
obtained for each of the experimental periods. Though 
the issue of response specificity (Lacey, 1950) arises, 
recent research (Schnore, 1959) indicates that heart 
rate reflects most consistently the general level of 
arousal. Cardiac rate was selected as the physiological 
measure of arousal in the present study also because it 
is among the more reliable autonomic measures, re- 
sponds rapidly, and can be recorded continuously with- 
out drastically limiting the subject’s activity and move- 
ment. 
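The reduction of the 30-second readings to one mean and one standard deviation per experimental period might look like the sketch below. The dictionary layout and the name `summarize_periods` are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption; the text does not say which form was computed.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def summarize_periods(rates: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Map each period name to (mean, sample SD) of its 30-second heart rates.

    Each list must hold at least two readings, since the sample SD is
    undefined for a single value.
    """
    return {period: (mean(values), stdev(values)) for period, values in rates.items()}
```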

Eighteen first- and second-year male graduate 
students who had volunteered to take part in a study 
of problem solving served as subjects. They were paid 
$6.00 each for participating in the study, which at the 
maximum required 3 hours. After the experimental 
procedures the subjects were asked for retrospective 
accounts of their problem solving process, their con- 
ceptualization of the problem, the methods by which 
they attempted to solve it, their awareness of any 
crucial moments, their feelings about the problem, and 
the degree of arousal they experienced during the 
course of the experimental problem. 


RESULTS 
Efficient vs. Inefficient Problem Solvers 


On the basis of the problem solving performance
on the experimental problem, the
18 subjects were divided into two groups:
9 “efficient” subjects (11-50 unnecessary questions)
and 9 “inefficient” subjects (73-161 unnecessary
questions). All subjects completed the
practice problem; 2 of the 9 inefficient subjects
did not solve the experimental problem in the
allotted time of 1 hour.





TABLE 1
ANALYSIS OF VARIANCE OF MEAN CARDIAC RATE FOR
EFFICIENT AND INEFFICIENT SUBJECTS

Subjects             Source             df      MS        F       p
Total (N = 18)       Groups              1    445.41     3.05     .10
                     Error term         16    145.96
                     Occasions           6     43.79    10.85    <.001
                     Group X Occasion    6     18.68     4.63    <.001
                     Error term         96      4.04
Efficient (N = 9)    Occasion            6     28.22     7.68    <.001
                     Error term         48      3.67
Inefficient (N = 9)  Occasion            6     34.25     7.78    <.001
                     Error term         48      4.40

TABLE 2
MEAN CARDIAC RATE OF EFFICIENT AND
INEFFICIENT SUBJECTS

                        Efficient    Inefficient
                        subjects     subjects       p
Occasion                (N = 9)      (N = 9)        (one-tailed)
Initial rest (A)         87.32        84.18          ns
Instructions             88.60        85.26          ns
Rest (B)                 88.18        83.82          ns
Practice problem         96.24        87.92         <.10
Rest (C)                 89.28        80.82         <.05
Experimental problem     94.16        80.80         <.01
Rest (D)                 87.50        75.84         <.01





The mean and the standard deviation of
cardiac rate for each of the seven experimental
occasions were obtained from the 30-second
measures of cardiac rate. Table 1 presents a
comparison of the mean cardiac rates, using a 
Lindquist (1953) Type I analysis of variance. 
The total between-group difference in cardiac 
rate for the entire experimental procedure 
approached significance at the .10 level, 
suggesting that the efficient group had a 
higher over-all cardiac rate. There was a 
significant total between-occasion difference 
(<.001), indicating that the various experi- 
mental occasions evoked different levels of 
cardiac rate. The between-occasion difference
within each of the groups was highly significant
(<.001), indicating that the different
experimental occasions evoked changes in rate
for both the efficient and inefficient subjects.
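In a Type I (mixed) design each effect is tested against its own error term: the between-group effect against the between-subjects error, and the occasion and Group X Occasion effects against the within-subjects error. The arithmetic can be checked from the mean squares printed in Table 1. The helper name `f_ratio` is hypothetical, and the sketch only recomputes ratios; it does not reproduce the full analysis.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F is the effect mean square divided by the matching error mean square."""
    if ms_error <= 0:
        raise ValueError("error mean square must be positive")
    return ms_effect / ms_error


# Mean squares as printed in Table 1 (total sample, N = 18).
between_groups = f_ratio(445.41, 145.96)  # between-subjects error term
occasions = f_ratio(43.79, 4.04)          # within-subjects error term
interaction = f_ratio(18.68, 4.04)        # within-subjects error term
```

The recomputed ratios agree with the printed F values to within rounding of the published mean squares.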

Table 2 presents the mean cardiac rate for 
the two groups during each experimental 
occasion and a comparison by t test of the
groups during each of the occasions. The two
groups did not differ significantly in mean
cardiac rate during the early experimental
conditions of initial rest and instructions.
After the instructions, however, the cardiac
rate of the efficient subjects became signifi- 
cantly greater than that of the inefficient 
subjects. The difference between the two 
groups reached statistical significance during 
the practice problem and continued significant 
for the remainder of the experiment, being 
particularly marked during the experimental 
problem. 

To test the differences in level of cardiac 
rate within each group between the different 
occasions, the Lindquist (1953) procedure for 
testing simple effects in Type I analysis of 
variance with significant Group X Occasion 
interaction was used. Within the efficient
group the cardiac rate during the practice and
experimental problems was significantly
greater (<.05) than it was in the other conditions.
Within the inefficient group the mean
cardiac rate during the practice problem was
significantly greater (<.05) than during all 
the other occasions except the initial rest and 
the instructions. However, the cardiac rate of 
the inefficient group during the experimental 
problem was significantly lower (<.05) than 
it was during the initial rest, instructions, and 
practice problem. 

In comparing the variability of heart rate 
of the two groups during the seven occasions, 
the same statistical procedures were employed. 
The standard deviation of each subject’s 
cardiac rate was obtained for each of the 
experimental occasions from the 30-second 
measures of cardiac rate. Table 3 presents a 
comparison of these standard deviations of 
cardiac rate, using a Lindquist Type I analysis 
of variance. The total between-group difference
was significant at less than the .025 level,
indicating a greater over-all intraindividual
variability of cardiac rate for the efficient
group. There was a significant total between-occasion
difference (<.001), indicating that
the various experimental occasions evoked
different levels of variability. The between-occasion
difference within each group was
obtained since there was a significant Group
X Occasion interaction. The between-occasion
difference within the efficient group was
highly significant (<.001); however, the
between-occasion difference for the inefficient
group was not significant and, in fact,
approached zero.

TABLE 3
ANALYSIS OF VARIANCE OF STANDARD DEVIATION OF
CARDIAC RATE FOR EFFICIENT AND
INEFFICIENT SUBJECTS

Subjects             Source             df      MS       F       p
Total (N = 18)       Groups              1    13.71     7.01    <.025
                     Error term         16     1.96
                     Occasions           6     3.01     5.28    <.001
                     Group X Occasion    6     2.85     5.00    <.001
                     Error term         96      .57
Efficient (N = 9)    Occasion            6     5.69     5.95    <.001
                     Error term         48      .96
Inefficient (N = 9)  Occasion            6      .16      .88     ns
                     Error term         48      .18

TABLE 4
MEAN STANDARD DEVIATION OF CARDIAC RATE FOR
EFFICIENT AND INEFFICIENT SUBJECTS

                        Efficient    Inefficient
                        subjects     subjects       p
Occasion                (N = 9)      (N = 9)        (one-tailed)
Initial rest (A)         3.628        3.380          ns
Instructions             3.850        3.366          ns
Rest (B)                 4.112        3.882          ns
Practice problem         7.530        3.866         <.01
Rest (C)                 4.274        3.678          ns
Experimental problem     6.794        3.180         <.005
Rest (D)                 3.934        3.534          ns

TABLE 5
ANALYSIS OF VARIANCE OF CARDIAC RATE OF
EFFICIENT AND INEFFICIENT SUBJECTS
DURING SIX OCCASIONS OF THE
PROBLEM SOLVING PROCESS
(Experimental problem only)

Subjects             Source             df      MS          F            p
Total (N = 18)       Groups              1    2002.2       10.04        <.005
                     Error term         16     199.4
                     Occasions           5     46.32        4.68        <.001
                     Group X Occasion    5     44.09       [illegible]  <.005
                     Error term         80      9.89
Efficient (N = 9)    Occasion            5     89.20        5.42        <.001
                     Error term         40    [illegible]
Inefficient (N = 9)  Occasion            5      1.21         -           ns
                     Error term         40      3.32








Table 4 presents the mean standard deviation
of cardiac rate for the two groups during
each of the experimental occasions and a
comparison by t test of the groups during each
of the occasions. Though efficient subjects were
significantly more variable in cardiac rate
over the entire experiment, this difference
resulted mainly from the significant difference
in variability between the two groups on the
practice problem (<.01) and the experimental
problem (<.005).

Fig. 2. Cardiac rate of efficient and inefficient groups during the three phases and three points of the experimental problem.

To test the differences in variability of 
cardiac rate between the different occasions 
within each of the groups, the Lindquist 
(1953) procedure for testing simple effects in 
Type I analysis of variance with significant 
Group X Occasion interaction was used. Within 
the efficient group the same pattern was found 
for variability of cardiac rate as was observed 
in the analysis of the levels of cardiac rate: 
variability was significantly greater (<.05) 
during the practice and experimental problems 
than during the other experimental occasions. 
Within the inefficient group there were no 
significant differences in variability between 
any of the occasions. 

The results thus far support the hypothesis 
that efficient problem solvers are significantly 
more rapid and variable than inefficient sub- 
jects in cardiac rate while attempting to cope 
with a complex cognitive problem. 


Cardiac Arousal at Crucial Points in Problem 
Solving 

The second hypothesis states that the 
cardiac arousal of efficient subjects should 
occur at crucial points in the problem solving 
process. The three crucial points and the 
three phases of the problem solving process 
that they demark, identify six occasions for 
analysis. By a Lindquist (1953) Type I 
analysis of variance, the cardiac rates for the 
two groups of subjects during each of the six 
occasions were compared (Table 5).³ The
total between-group difference of cardiac rate
during the experimental problem was significant
(<.005), and as was indicated earlier,
the efficient subjects had a higher cardiac
rate. There was also a significant total
between-occasion difference (<.001). Examination
of the between-occasion differences for
each group separately showed them to be
almost wholly attributable to the efficient
group. These findings are presented graphically
in Figure 2.

³ A comparison of the intraindividual variation of
cardiac rate during these six conditions of the experimental
problem was not possible since the measure of
cardiac rate at the three points of the process is based
on heart rate during one 30-second interval measure.
In each of the three phases of the process, however,
efficient subjects had significantly greater variability
of cardiac rate than inefficient subjects, and there were
no significant within-group differences in variability
for the three phases.

Fig. 3. Cardiac rate of two efficient subjects during the experimental problem.

Using a t test for correlated distributions
(McNemar, 1955), the cardiac rate at the
points of necessary and sufficient information,
analysis-synthesis shift, and solution were
compared with the mean cardiac rate of the
phase before and after each of these points.
As indicated in Figure 2, efficient subjects had
elevations in cardiac rate at all three points,
which were significantly greater than the
mean heart rate of the phase that preceded
and the phase that followed each point. For
the inefficient group, on the other hand, only
one significant difference was noted, between
the point of necessary and sufficient information
and the mean value of the lag phase. Figure
3 presents individual records of the two efficient
subjects who most clearly demonstrate
the phenomena. Figure 4 presents individual
records of two inefficient subjects. These data
support the hypothesis that elevations in
cardiac rate of efficient subjects accompany
crucial developments in their problem solving
processes.

Fig. 4. Cardiac rate of two inefficient subjects during the experimental problem.
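A t test for correlated distributions compares two measures taken on the same subjects by working on the per-subject differences. The sketch below shows the standard paired computation; the function name `paired_t` is hypothetical, and any numbers passed to it in an example are made-up values, not data from this study.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for paired observations x and y.

    t = mean(d) / (sd(d) / sqrt(n)), where d is the list of differences.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences are constant; t is undefined")
    t = mean(diffs) / (sd / sqrt(len(diffs)))
    return t, len(diffs) - 1
```

Here `x` would hold each subject's cardiac rate at a crucial point and `y` the same subject's mean rate during the adjacent phase.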

At the end of the experimental problem 
each subject was asked for a retrospective
report of his problem solving. None of the
subjects was aware of, or conceptualized, the 
crucial points of the process. Subjects were 
also asked for a description of their feelings 
while solving the problem and to rate, on a 
six-point scale, their degree of “arousal, 
anxiety, or tension” while solving the experimental
problem.⁴ With the score of 6 indicating
maximum arousal, anxiety, or tension, the 
mean score for the efficient group was 2.71, 
lower, though insignificantly so, than the 
mean score of 3.57 for the inefficient group. 


⁴ Four of the 18 subjects were not given the questionnaire
about their feelings of arousal during the
experiment; for this analysis there are 7 rather than 9
subjects in each group.


DISCUSSION


Arousal, as measured by elevations in cardiac
rate, is clearly related to the efficiency with
which subjects solve the complex logical
problems of the present study. Efficient subjects
had significant increases in cardiac rate
and variability of cardiac rate during problem
solving both in comparison to their own resting
baseline and in comparison to the pattern of 
cardiac response of inefficient subjects. These 
findings support the hypothesis that arousal is 
characteristic of efficient functioning and that 
arousal seems to have an adaptive or facili- 
tating effect. The facilitating effect of arousal 
may be, as Hebb (1955) postulates, ‘“‘to tone 
up the cortex with a background supporting 
action that is completely necessary if messages 
are to have their effect’; for “without a 
foundation of arousal the cue (or learning) 
function cannot exist.” In a recent exploratory 
study, Beckman and Stein (1961) report that 
efficiency on the PSI apparatus correlates 
significantly with reduced amounts of alpha 
in resting EEG records. Their findings of a 
general “cortical excitation” in the efficient 
subject are consistent with those of the present 
study and tend to support the interpretation 
that autonomic arousal during efficient func- 
tioning reflects the general tendency toward 
higher levels of cortical excitation. 

The results of the present study also in- 
dicate that arousal is not a total reaction but, 
rather, occurs differentially and, in part, at 
crucial points in the problem solving process. 
Efficient subjects had significant elevations of 
cardiac rate at three crucial points in the 
problem solving process even though they 
could not identify or conceptualize such 
points or stages in retrospective report.⁵
Arousal at these crucial points in the problem
solving process may not only be indicative of 
the implicit understanding that efficient 
subjects have of the problem, but may also 
serve as a stimulus through which the subject 
begins to make more explicit his implicit 
conceptualization of the problem. Thus in 
addition to facilitating the cue or learning 
function, arousal, or variations in arousal, 
may also have cue properties. The stimulus 
properties of autonomic responses have been 
extensively discussed by Lacey and Lacey 
(1958). 

⁵ Cardiac arousal at these crucial points in the
thought process which are not reported by subjects has
been discussed in an earlier paper (Blatt, 1960) as the
intuitive, sensitive, affect-laden functioning frequently
conceptualized as preconscious.

In the relationship between problem solving
efficiency and autonomic arousal, the variability
of autonomic response seems to be as
important as the level of arousal. Lacey and
Lacey (1958) have discussed the degree of 
autonomic fluctuation as a reliable individual 
characteristic. They found that fluctuations 
in heart rate and skin resistance during a 
resting state were significantly related to 
motor impulsivity and to the length of time 
an individual could maintain maximal readi- 
ness for response. They concluded that though 
autonomic variability may have disruptive 
effects in motor tasks, there are also indications
that it has a facilitatory effect on
receptor-cortical functions. In a variable arousal
pattern, subjects can respond differentially to
various aspects and elements within a situation,
and these differential reactions may
serve to facilitate functioning. If the facili- 
tating effect of arousal is to “tone up the 
cortex” (Hebb, 1955), a constant state of 
arousal, regardless of level, may reach habitu- 
ation. Changes and variations in arousal level 
may be more effective in maintaining a general 
level of cortical excitation. At the extreme 
ends of the arousal continua where arousal is 
exceedingly low or exceedingly high, there is 
relatively little variability of response. It is 
only in the middle ranges of the arousal 
continua where reactions can vary and fluc- 
tuate. Therefore, this capacity for differential 
reaction and response may be the important 
facilitating effect rather than the absolute level 
of the arousal. 

Some of the psychological and motivational 
counterparts of autonomic arousal seem to be 
the positive attraction of a complex and 
challenging task. In response to a question- 
naire about their feelings during the experi- 
mental problem, efficient subjects frequently 
commented in retrospect about the “satis- 
faction of working out the problem—it was 
like trying to play a musical instrument,” 
“elation as solution was reached—the problem 
was rather stimulating,” “the problem was 
lots of fun, I enjoyed it.” Inefficient subjects 
frequently wrote of “frustration, but I knew I
would solve it if I kept thinking in terms of
it,” “annoyance that you must go along only 
one path and continually get stuck at the 
same point along that path,” “elation when 
pressing a button caused a reaction which 
fitted into my preconceived ideas, but determination
when I saw the end in sight,”
“interesting, required concentration all
throughout.” Though the cardiac arousal and
comments of intrigue, elation, stimulation, 
and excitement of the efficient subjects can be 
interpreted as indicative of a greater level of 
involvement, motivation, or a feeling of 
success, in many ways inefficient subjects 
seemed equally well motivated. Subjects had 
no idea of their relative degree of efficiency 
for no information was given them through 
which they could judge their performance 
other than having been told of the time limit 
and that this was a “reasonable amount of 
time.” On reaching solution, inefficient sub- 
jects frequently expressed great satisfaction at 
having solved the problem. In terms of the 
output of physical energy, inefficient subjects 
pressed more buttons at a much faster rate 
than the efficient subjects, and in this sense, 
worked harder. Inefficient subjects also 
tended to report a higher degree of experienced 
arousal or tension. 

Thus the difference in arousal patterns of
efficient and inefficient subjects does not seem
to be simply an issue of the degree of motiva- 
tion, but rather more one of the type of 
motivation. As suggested in their spontaneous 
comments, efficient subjects seemed freer from 
internal needs and pressures and were better 
able to attend to and appreciate the nuances 
and subtleties of the problem and to see it as 
an exciting and intriguing puzzle. Inefficient 
subjects seem to express in their comments a 
need to master, control, or impose their own 
preconceived structure on the problem rather 
than naturally following the leads that evolve 
from the problem. The playful exploration and 
feelings of discovery of the efficient subjects 
seem more conducive to an appreciation of the 
subtleties and complexities of the problem 
and to more efficient functioning. In earlier 
research (Blatt & Stein, 1959) efficiency on the 
PSI was found to have a highly significant 
positive correlation with the esthetic value of 
the Allport-Vernon-Lindzey Scale of Values. 
The esthetic value, with its emphasis on form 
and harmony, seems to typify this playful 
exploration and the freedom to attend to and 
appreciate the enjoyable and exciting aspects 
of the environment. 

The importance of playful exploration and 
the “conflict-free” exercise of ego functions 
has become a major focus of recent developments
in experimental psychology and in
personality theories. In developmental processes
the role of play is seen as a crucial
aspect for the objective recognition of the 
environment. As Schachtel (1954) states, it is 


... only thought which is sufficiently free from urgent
needs or fears that can contemplate its objects fully 
and recognize it in relative independence from the 
thinker’s needs and fears—that is, as something ob- 
jective. In thinking about a problem one is usually 
successful only if one does not press too hard for solu- 
tion; that is, one is more likely to be successful if the 
thought is truly object-centered, free to contemplate 
the object from all sides, than if the thought is goal- 
centered, under the pressure of having to produce a 
solution immediately. 


It is this playful exploration, this exciting 
appreciation of the problem from all sides, 
and the intrinsic satisfaction in functioning, 
which seems to be the psychological counter- 
part of the adaptive arousal indicated by the 
elevations in cardiac rate during efficient 
problem solving. 


SUMMARY 


Concomitant recordings of heart rate were 
obtained from 18 male young adult subjects 
during complex problem solving on the John- 
Rimoldi PSI apparatus. The level of cardiac 
rate and its variability were compared for rela- 
tively efficient and inefficient subjects. Though 
the two groups of efficient and inefficient 
subjects were initially similar in cardiac 
patterns, there was a highly significant in- 
crease in cardiac rate and variability in the 
efficient subjects while they were attempting 
to solve the problems. These increases in 
cardiac rate and variability of cardiac rate of 
efficient subjects were significantly greater 
than their own initial resting baseline as well 
as being significantly greater than the changes 
in cardiac patterns of the inefficient subjects. 
The elevations of cardiac rate of efficient 
subjects occurred, in part, at crucial moments 
in the thought process: where necessary and 
sufficient information for solution was avail- 
able, where the subject’s predominant activity 
changed from analysis to synthesis, and at 
solution. 

These findings were discussed in terms of 
the role of autonomic arousal in efficient 
functioning and in terms of some of the possible
psychological or motivational counterparts.


PRAISE AND CENSURE AS MOTIVATING VARIABLES IN THE
MOTOR BEHAVIOR AND LEARNING OF SCHIZOPHRENICS¹


ROY C. LONG²


University of Texas


It is now several decades since investigators
established the performance deficit and
learning impairment of schizophrenics in
general (Babcock, 1933; Boring, 1913; Hull,
1917; Kent, 1911). More recent experimental
work, preceded by extensive clinical observation,
has also demonstrated the hypersensitivity
of schizophrenics and the detrimental
effect of censure upon their performance in a
variety of tasks, whereas praise seems to evoke
responses similar to those of normals (Bleke,
1955; Garmezy, 1952; Hahn, 1956; Webb,
1955). Olson (1958) found that while a negative
experimental condition (representing social
punishment) may not be detrimental to the
performance of schizophrenics, it has less
effect than a positive condition in causing them
to better their performances. A heightened
generalization gradient has been shown to char- 
acterize the schizophrenic patient (Dunn, 1954; 
Garmezy, 1952; Mednick, 1955), apparently 
linked to the learning impairment and reaction 
to censure. Similarly, Mednick (1958) has 
differentiated high drive schizophrenics 
(acutes) from low drive schizophrenics (chron- 
ics). 

It thus appears that acutes may be easily 
stimulated, particularly by censure, because 
ego defenses have not been tightened suffi- 
ciently at the early stage of the disorder to 
handle high drive associated with threat of dis- 
approval. Clinical observations have generally 
shown the converse to be true of the chronically 
ill. They are more rigid, with concretized de- 
fenses, and are usually quite difficult to arouse 
from their state of seclusiveness. An attitude
of disapproval from those in their environment 
has apparently taken its toll with a resultant 
lowered drive level. Fromm-Reichmann (1950) 
has ascribed the seclusiveness of the severe
schizophrenic patient to the wish to avoid
“another rebuke in a long row of thwarting rebuffs
which the schizophrenic has experienced
in childhood and conditioned him to expect in
repetition.”

¹ This paper is based on a doctoral dissertation
submitted to the University of Texas.

The author is indebted to his advisor, Ira Iscoe, for
guidance and encouragement during the course of the
study and to Ruth M. Hubbard and the staff members
of the Veterans Administration Hospital, Waco, Texas,
who aided in innumerable ways in the research project.

² Now at Veterans Administration Hospital, Waco,

Several investigators (Farber & Spence, 
1953; Spence & Taylor, 1953; Taylor, 1956; 
Taylor & Spence, 1954) have tested predictions 
derived from learning theory about perform- 
ances in simple and complex tasks under differ- 
ing drive intensities. Interfering response tend- 
encies under high drive have been considered 
to impair efficiency on complex tasks, whereas 
high drive facilitates performance on simple 
conditioning tasks. The present study attempted
to extend the assumptions of Hull
(1943) about drive level to encompass the type 
of social situation (censure or praise) in which 
the schizophrenic typically manifests perform- 
ance deficit. 

The social condition of censure was con- 
sidered as a threatening, drive arousing situa- 
tion which further increases the individual’s 
general drive state. On the other hand, it was 
assumed that a praise condition could be reas- 
suring, modifying a subject’s need to achieve 
and decreasing his drive state. A neutral or 
nonevaluative condition would seem to have a 
motivational effect somewhere between that of 
praise and that of censure. The design provided 
tasks whereby a subject could, for example, 
demonstrate increased drive level on a second 
task, if he had been censured during the first 
task, by increasing his output or efficiency on 
the second task and gain a tangible reward 
(cigarettes). The effects of praise and of non- 
evaluation during the first task, both immedi- 
ate and in carrying over to the second task, 
could be determined in the same manner. 

The primary purpose of the study was to in- 
vestigate what effect the over-all drive state 
of the schizophrenic, produced through the 
interaction of initial drive level matched with 
social condition, has in influencing his gross 
output (operationally defined as the number of 
trials a subject took during a task). Major 
Hypotheses 1 and 2 pertained to this question, 
and secondary Hypotheses 3, 4, 5, and 6 were
concerned with what the gross output accomplishes
either in immediate change in performance
on a simple discrimination task (A)
or in efficiency on the second and more complex 
learning task (B), involving motor responses 
in a dominant-nondominant multiple-choice 
situation. (Efficiency on Task B was defined as 
the number of correct responses among all 
those which the subject made on the task.) 
The interaction between socially rewarding, 
neutral, or socially censuring stimuli and the 
original drive level of the individual seemed 
attainable by imposing one of the three types 
of social condition upon one of the two degrees 
of drive level, thereby creating a “set” under
which the individual would work. Specific 
hypotheses are elaborated in the presentation 
of the results. 


METHOD 


Subjects. Seventy-two adult male patients from a 
Veterans Administration neuropsychiatric hospital 
served as subjects. Ages ranged from 20 to 60. Thirty-six
had been psychiatrically diagnosed as being in the acute 
(high drive) stage of schizophrenia; the remaining 36 
were classified as chronics (low drive). The usual sub- 
types of schizophrenia were included. No subjects with 
organic brain involvement or with signs of mental 
deficiency were used. All acute subjects were of similar 
age, intellectual function, educational attainment, and 
approximate length of illness; the same was true for 
chronics. It was not the intent of this study to match 
acutes with chronics but to investigate the distinct 
responsiveness of both groups. Matching would involve 
interference with the differentiating characteristics.
Twelve of the acutes and 12 of the chronics were 
randomly assigned to one of the three social condition 
subgroups (praise, neutrality, censure). Everyone was 
tested individually on both tasks. 

Task A. The first task was a visual-motor discrimina- 
tion situation, the primary purpose of which was to 
serve as a means of subjecting a subject to praise or 
censure. It required him to transfer colored marbles as 
fast as he could from a bin on the left-hand side of a 
table to appropriate holes in a panel set flush with the 
table surface on the right-hand side. A bin beneath the 
holed panel caught the inserted marbles, but only after 
they had interrupted an electronic beam. As each 
marble broke the beam, this activated a Veeder counter, 
and the total number of marbles inserted during the 
time period of 7 minutes was recorded. 

The counter reading at the end of the first minute 
of Task A represented the subject’s base rate. No 
censure or praise was given up to this time. In evaluat- 
ing later performance changes, the rate for the first 
minute was used as a base for comparison, though it 
was not included in the time period (second through 
seventh minute) during which the effect of the conditions 
was intended to apply. A pilot study proved 
fatigue to be a negligible factor during this time interval. 
Total performance scores, inclusive of base rate, were 
computed for all six of the experimental groups so that 
differences between type of subject, as well as between 
conditions, could be given an over-all evaluation. This 
was also done for the base rates to see how closely 
acute subgroups, as well as chronic subgroups, were 
matched. Since information as to the effect of social 
censure or praise on performance rate in comparison 
with the base rate (whether high or low) was the 
primary aim, computation to supply this was done for 
each individual prior to statistical analysis. The method 
used to derive a measure of the increase or decrease in 
performance (mean number of marbles inserted per 
minute) is shown below: 


    198.6   Number of marbles inserted during the entire 7 minutes 
  -  26.5   Minus the base rate 
    172.1   Total marbles inserted during second to seventh minutes 
     28.7   Mean number of marbles/minute during second to seventh 
            minutes (172.1 divided by 6) 
  -  26.5   Comparison with base rate 
     +2.2   Change (increase) in number of marbles/minute 

(It can be seen that a negative number of marbles/minute 
would indicate a decrease.) 
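
The worked figures above can be checked with a short sketch. The function name and its parameters are illustrative and not part of the original report; the sketch assumes only what the text states, namely that the first minute supplies the base rate and the remaining six minutes supply the comparison rate.

```python
def change_in_rate(total_marbles, base_rate, later_minutes=6):
    """Change in marbles/minute relative to the first-minute base rate.

    total_marbles: marbles inserted during the entire 7 minutes
    base_rate:     marbles inserted during the first minute
    later_minutes: minutes over which the social condition applied
                   (second through seventh minute, i.e. 6)
    """
    later_total = total_marbles - base_rate      # marbles in minutes 2-7
    later_mean = later_total / later_minutes     # mean marbles/minute in minutes 2-7
    return later_mean - base_rate                # positive = increase, negative = decrease


# Figures from the worked example in the text.
example_change = change_in_rate(198.6, 26.5)
print(round(example_change, 1))
```

A subject whose later rate fell below his first-minute rate would receive a negative score, which is the decrease the parenthetical note describes.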

The general instructions given for Task A included 
a demonstration of how the colored marbles were 
inserted one at a time into a hole of matched color. 
When this testing period was ended, it was necessary 
only that the subject slide his chair slightly so as to 
move from the table and place himself before the panel- 
board of Task B, which was sitting just to the right 
of the table. 

Task B. The second piece of equipment consisted 
of four red, jeweled reflectors, illuminated by 7-watt 
pilot bulbs arranged horizontally on a panel. A small 
push button was situated 3.5” directly below each light. 
Illumination of the reflectors was made in random 
order, and each one stayed lighted for a period of 7 
seconds unless it was turned off by the subject’s pressing 
the correct button to extinguish it. If the correct button 
was not pressed, the reflector remained lighted for the 
full 7 seconds, followed by a 2-second interval during 
which none of the reflectors were lighted; then the next 
one in random order was turned on automatically 
through the use of appropriate electrical contacts on a 
revolving drum. A chute and transparent-topped 
receptacle was provided, into which a cigarette dropped 
whenever the subject extinguished a light by pressing 
the correct button. Since the subject could see this 
tangible reward being given for his effort, it was 
assumed that achievement need could be satisfied as he 
accumulated these valued objects, even though no 
verbal comment was made to him. 

In Task B, initially dominant (pre-experimentally 
acquired) position habits were arbitrarily designated 
as either correct or incorrect. Morin and Grant (1955) 
reported findings in line with the assumption that under 
conditions in which the various lights are successively 
activated, the initially dominant response is to the 
response element which is in direct spatial correspondence, 
i.e., directly underneath. Half of the to-be-learned 
S-R combinations in the present situation (Numbers 
1 and 4) involved the initially dominant response, 
whereas the remaining half (Numbers 2 and 3) did not. 
The effects of the statements (praise, neutrality, 
and censure) used in the present study would be 
expected to interact with the particular S-R combination 
involved. For those combinations in which the 
initially dominant response was the correct one, drive 
heightening conditions should increase the efficiency 
of performance, but should impair efficiency where the 
dominant is not the correct response. Drive reducing 
conditions should work conversely in facilitating per- 
formance efficiency for nondominant-correct response 
combinations more than for dominant-correct response 
combinations. 

Task B was designed not only to indicate the efficiency 
of the subject’s performance in this alternate-choice 
motor-learning problem, but primarily to obtain 
a measure of drive level, so the number of trials taken 
was left up to him. The only verbalization required 
throughout the testing session was that he indicate 
when he was ready to quit and leave with the number 
of cigarettes he had won. The number of trials he took 
on the task was used as a measure of drive output. A 
secondary measure of learning efficiency was gained by 
setting an arbitrary criterion of eight correct responses 
(i.e., four dominant and four nondominant correct 
responses in a row). 

The instructions given for this task emphasized 
that the pushing of the “correct” key would bring a 
cigarette reward and that the subject should say when 
to stop. 

Experimental social conditions. The attempt to 
instill the attitude of success (praise) or failure (censure) 
regarding performance on Task A was done by making 
a statement to each subject after the first minute he 
worked on the task and again after each minute through 
the sixth. No comments were made at all during the 
neutral condition. After the seventh minute, the subject 
was stopped, and instructions were immediately given 
for Task B. 

The statements which were made to the subjects 
represented the only way they had of knowing whether 
their performance was considered good or bad; no 
reference point was provided. The following comments 
were used: 

Praise condition. “That’s fine,” “You're doing 
very well,” “Very good,” and “You really know 
how to do this very well.” 

Censure condition. “You're not doing well,” 
“That’s poor,” “That isn’t good,” “You don’t know 
how to do this very well,” and “You aren’t very 
good at this.” 


The same statement was never given twice in succession. 


RESULTS 

Hypothesis 1. Acute (high drive) schizo- 
phrenics show greater output on Task B if they 
have been censured beforehand than if they 
have been praised or had nothing said to them. 

The acute group, as predicted, made significantly 
greater output (p < .01) on Task B following 
censure on A, thus performing in a 
manner similar to that which previous studies 
have shown for normals. The median chi square 
test was employed to cover the wide variability 
of subjects, and the distribution and values 
are shown in Table 1. 

TABLE 1 
OUTPUT ON TASK B 

                     Success   Neutral   Failure   Total 
Acutes (Median = 35.0) 
  Above median          6         2        10        18 
  Below median          6        10         2        18 
  Total                12        12        12        36 
  df = 2   χ² = 10.67* 
Chronics (Median = 22.0) 
  Above median          8         6         3        17 
  Below median          4         6         9        19 
  Total                12        12        12        36 
  df = 2   χ² = 4.33 

* Significant at .01 level. 
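
As a check on the values in Table 1, the chi square statistics can be recomputed from the cell counts. The sketch below is illustrative only. It assumes an expected frequency of 6 in every cell (12 subjects per condition split evenly at the median), which is an assumption about how the original values were obtained, and it uses the fact that for 2 degrees of freedom the upper-tail probability of chi square is exp(-x/2).

```python
import math


def chi_square(observed, expected):
    """Pearson chi square: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """Upper-tail probability of a chi square variable with 2 degrees of freedom."""
    return math.exp(-chi2 / 2)


# Cell counts from Table 1: above-median row, then below-median row,
# each in the order Success (praise), Neutral, Failure (censure).
acutes = [6, 2, 10, 6, 10, 2]
chronics = [8, 6, 3, 4, 6, 9]

# Assumption: an even split at the median, so 6 expected per cell.
expected = [6] * 6

chi2_acutes = chi_square(acutes, expected)
chi2_chronics = chi_square(chronics, expected)

print(round(chi2_acutes, 2), round(p_value_df2(chi2_acutes), 4))
print(round(chi2_chronics, 2), round(p_value_df2(chi2_chronics), 4))
```

Because the chronics divide 17 and 19 rather than 18 and 18 around their median, expected frequencies taken from the row totals would give a slightly smaller statistic (roughly 4.2) than the tabled 4.33; the conclusion that the chronic result falls short of significance is the same either way.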

Hypothesis 2. Because of low drive responsi- 
tivity, chronic schizophrenics cannot be moti- 
vated enough by censure to change Task B 
output for it to differ significantly from that 
following praise or silence. 

The initial drive level of the chronics proved 
to be too low, as predicted, for there to be a sig- 
nificant difference (p < .12) in their output on 
Task B according to whatever change in drive 
state the experimental social conditions might 
have provoked. The results were worth noting, 
despite not reaching significance, because the 
chronics seemed to increase output when 
praised or had nothing said to them and did 
little when censured. This was the reverse of 
how acutes responded. 

Hypothesis 3. Under censure conditions on 
Task A, acutes show significantly better im- 
mediate performance than under praise. 

Hypothesis 4. Chronic schizophrenics possess 
such an initially low drive that their immediate 
performance upon Task A is not affected differ- 
entially by any imposed social condition. 

The effect of censure, neutrality, or praise 
upon the immediate performance changes of 
either acutes or chronics on simple Task A 
was not significantly demonstrated, although 
acute subjects showed some disposition to 
behave in the predicted direction for acutes, 
and the same reverse order was noted for the 
chronics as was found with output on Task B. 

Hypothesis 5. Acutes’ efficiency on Task B is 
less following censure because of aroused irrelevant 
response tendencies than it is following 
a praise or neutral condition. 

TABLE 2 
PERCENTAGES FOR TYPE OF CORRECT RESPONSE ON TASK B 

                  Praise                 Neutral                Censure 
            Dominant  Nondominant  Dominant  Nondominant  Dominant  Nondominant 
Acutes        47%        16%         48%        11%         41%        16% 
                  (63%)                  (59%)                  (57%) 
Chronics      35%         4%         49%         0%         43%         4% 
                  (39%)                  (49%)                  (47%) 

TABLE 3 
SOCIAL DATA ON SUBJECTS 

                                 Mean level of 
                       Average   intellectual    Educational   Mean length of 
           Condition   age       function (IQ)   attainment    illness (years) 
Acutes     Praise      33.7      104.6           10.5           8.2 
           Neutral     31.9       99.6           10.8           7.3 
           Censure     31.3      100.0            8.9           7.4 
           Xa          32.3      101.4           10.1           7.6 
Chronics   Praise      41.8       77.3            9.5          16.3 
           Neutral     40.9       85.4            9.3          13.6 
           Censure     36.9       87.3            9.9          12.6 
           Xb          39.8       83.4            9.6          14.2 
Xa - Xb                 7.5       18.0            0.5           6.6 
Significance of t      p = .01   p = .01                       p = .01 

It was considered that censure might help 
chronics slowly reach a more “optimum” drive 
state, i.e., somewhat more responsive but not 
such that numerous irrelevant response tend- 
encies overwhelm them. The next prediction 
was made to test this idea. 

Hypothesis 6. Following censure on Task A, 
chronics’ efficiency on B is greater than when 
neutral or praise conditions are employed. 

The experimental conditions did not significantly 
affect learning efficiency of acutes or 
chronics for Task B. This is shown in detail 
in Table 2. By reading across the rows, it may 
be seen that similar percentages of dominant- 
correct responses were made by acutes according 
to condition, and by chronics according to 
condition. There was also similarity in the percentage 
of nondominant-correct responses 
made within these patient subgroups. 

An analysis of variance confirmed that there 
was a highly significant difference (p < .001) 
in type of response (dominant vs. nondominant) 
given by all subjects (F = 96.41; df = 
1). Both acutes and chronics were most efficient 
when a dominant response was the correct one. 

Since the variables of age, IQ, length of illness, 
and educational attainment had been 
held constant for the three acute subgroups, 
as well as for the chronic subgroups, a sideline 
comparison between acutes and chronics 
proved interesting. Acutes were not found to be 
significantly superior to chronics on Task A, 
although their superior performance was expected. 
On the other hand, 12 of the 36 acutes 
reached the arbitrary learning criterion on 
Task B of eight consecutive correct responses, 
whereas none of the chronics learned the task. 
Acutes also made a higher percentage of the 
more difficult nondominant correct responses. 
Variance analysis showed not only that there 
was a difference between dominant and nondominant 
response type (D), but also that 
acutes and chronics differed significantly (F = 
31.77; df = 1) in making these responses. Inspection 
of Table 2 reveals that the percentage 
of nondominant-correct responses was 16 for 
acutes under both censure and praise conditions 
in contrast to only 4 for chronics. Neutral 
acutes also were obviously better than neutral 
chronics by a difference of 10 percentage points. 
The social data collected on the subjects 
and compiled in Table 3 offer an explanation 
for the acutes’ superior performance on Task 
B. The differences in age, IQ, and length of 
illness between the acute and chronic groups 
are all significant at the .01 level. However, it 
must be noted that there is very little difference 
in educational achievement between the 
two types of subjects. This latter finding suggests 
that the premorbid intellectual potential 
of all 72 subjects was quite similar, but that 
the seven more years of illness in the chronic 
group, mostly spent in a hospital setting, may 
account for the IQ differences. Such differences 
between the acutes and chronics apparently 
determined the differential efficiency of performance 
on the task. 

DISCUSSION 

It was shown that an attitude established 
on one task could differentially evidence itself 
by carrying over or generalizing to another 
task. This result is in line with previous research 
which established the heightened generalization 
gradient as characteristic of the 
schizophrenic individual. The output differences 
for the acute subgroups were as expected 
from learning theory extensions. The marked 
increase under the failure condition can be 
explained by the upsurge in drive state of the 
subject and his efforts to handle the perceived 
failure and unsatisfied achievement need by 
putting out more work and thereby winning 
the reward of cigarettes. This did not occur 
when drive reducing conditions (praise and 
neutral) were employed. The individual’s drive 
state was apparently lowered and his achieve- 
ment need was satisfied either by the praise or, 
seemingly, even better by accepting the notion 
that “no news is good news,”’ i.e., the neutral 
condition. 

The prediction that chronic subjects could 
not be motivated enough by social condition to 
cause differential output on Task B was con- 
firmed. Nothing said to them resulted in any 
significant change. The trends obtained, while 
not significant (p < .12), were the reverse of 
those shown by acutes and may warrant 
further investigation. 

With regard to the secondary area encom- 
passed by the study, there were also trends in 
the predicted direction for acutes as to their 
mean change in performance on Task A, as 
well as the reverse trends for the chronics, al- 
though statistical significance was again not 
reached. The marked variability of the sub- 
jects was clearly a cause for this. Failure to 
find differences in acutes’ or chronics’ efficiency 
of performance on Task B according to whether 
censure or praise was employed warrants more 
thought. Since there was a significant difference 
between the type of response given, i.e., 
3-4 times more dominant than nondominant 
for the acutes and approximately 9-10 times 
more dominant than nondominant for the 
chronics, there is little doubt that the nondominant 
connections were much more difficult for 
all subjects to make. Yet they were equally 
difficult under the three conditions. Therefore, 
there seemed to be no increase or decrease 
in irrelevant response tendencies as a 
result of what was said to a subject, and the 
predictions made from theoretical extensions 
appear to be in error. 

Several things about this may be considered: 
It is indicated by the findings regarding output 
that the statements were having an effect on 
acute subjects. Perhaps the focus of attention 
on deciding whether or not to remain at the 
task carried more weight than concentrating 
on the type of response. Also, the majority of 
the subjects gave a predominantly larger num- 
ber of dominant responses. When a dominant 
response was the correct one to make, as it was 
50% of the time, the subject was reinforced by 
winning a cigarette. If drive level could be re- 
duced and achievement need satisfied by win- 
ning a cigarette every second or even third 
trial, the motivation might be toward sitting 
through trials and not being overly concerned 
about nondominance failures. Drive state 
could be evidenced merely by output or num- 
ber of trials taken. In this connection, the fund 
of knowledge about the higher resistance to 
extinction of partially reinforced learning is 
apropos. Subjects were certain to make some 
of the easy dominant-correct responses. Domi- 
nants were correct half the time and were ap- 
parently continued because of the partial rein- 
forcement. This could have been operative in 
determining the number of trials taken as well. 
It does not provide a complete answer, how- 
ever, because the percentage of dominant-cor- 
rect responses was similar for all subgroups 
although there were significant differences in 
number of trials taken by acutes. 

The over-all answer may be that neither 
neutrality nor a statement of praise lowered 
the drive state of the acute enough for him to 
overcome the irrelevant response tendencies 
and “solve” the task better than when he was 
censured, whereas the chronic’s drive state 
was never increased to the point where he 
would make other than the easy response, and 
his lowered intellectual efficiency prevented 
his learning the more difficult one. 


SUMMARY 


In this study concerning the motivation of 
schizophrenics, Hullian assumptions regard- 
ing drive level were extended to encompass 
social situations in which schizophrenics have 
been shown to manifest performance changes, 
and to differentiate acute schizophrenics from 
chronic schizophrenics on the basis of high vs. 
low drive level. 

Three experimental social conditions, drive 
arousing censure, drive decreasing praise, and 
a neutral (silent) condition were used with 72 
adult male schizophrenics. Thirty-six acutes 
were of similar intellectual function and ap- 
proximately equal in age, educational attain- 
ment, and length of illness. Thirty-six chronics 
were also intermatched on these variables. The 
design of the study permitted a subject to 
demonstrate changes in drive level on a motor 
learning task (B) according to whether he had 
been censured or praised previously on a sim- 
ple psychomotor task (A). 

The generalizing effect of the social condi- 
tions (censure, neutral, praise) from one task 
to the other was found. 

The acute group demonstrated significantly 
(p < .01) greater output on Task B following 
censure on Task A than following praise or neutral 
conditions on Task A and, as had been 
hypothesized, performed more as previous 
studies have shown to be typical for normals. 

As predicted, the initial low drive chronics 
could not be motivated to show significantly 
different response (p < .12) to the social situa- 
tions. 

The effect of censure, neutrality, or praise 
upon the learning efficiency on Task B of both 
acutes and chronics or upon their immediate 
performance during Task A was not signifi- 
cantly demonstrated. 

Acutes were found to be significantly su- 
perior to chronics only on Task B. This could 
be attributed to the acutes’ youth, higher in- 
tellectual function, and to their shorter length 
of illness, which was half that of chronics. 
Educational attainment was almost identical 
for acutes and chronics, however, suggesting 
that their premorbid mental potential was 
nearly equal. 

In general, the primary thesis of the study 
regarding the effect upon the effort or gross 
output made by schizophrenics on a task, fol- 
lowing previous treatment of drive level by 
censure or praise, was supported. The sec- 
ondary hypotheses dealing with the efficiency 
of this output could not be supported. The re- 
sults suggested some genuine differential moti- 
vating effects of the social conditions and gave 
support to the employment of the drive level 
concept in further study with pathological 
groups. 


Roy C. Long 
WORD MEANING AND SEXUAL IDENTIFICATION IN PARANOID 
SCHIZOPHRENICS AND ANXIETY NEUROTICS¹ 


MARVIN S. BEITNER 


Garden Grove, California 


Osgood and Luria (1954) reported “A 
Blind Analysis of a Multiple Person- 
ality” in this journal. They used the 
Semantic Differential as a clinical test and ob- 
tained provocative results. Since the findings 
were based upon a single clinical case, however, 
the authors were limited to a discussion of in- 
teresting observations and speculations about 
the patient’s personality dynamics. The pres- 
ent study takes advantage of the clinical ap- 
plicability of the same instrument, but expands 
the field of observation to 120 subjects. Its 
purpose is to test some commonly accepted 
hypotheses about the role of identification and 
the nature of word meaning in psycho- 
pathology. 

According to psychoanalytic theory one of 
the crucial developmental steps in the forma- 
tion of personality is the identification of the 
child with his like-sexed parent. The father is 
regarded as a “behavior model” whom the 
healthy boy uses for a self-pattern. The 
paranoid schizophrenic is supposed to have 
failed to use his father as a model for identifica- 
tion and to tend to identify too closely with 
his mother. Other hypotheses about sexual 
identification in paranoid schizophrenia suppose, 
on the contrary, a general lack of identification 
with both parental figures. Some of 
these hypotheses about pathological sexual 
identification are tested in this study. A second 
series of hypotheses explores the variability of 
word meaning as related to psychopathology. 


¹ Sections of this study were originally presented as 
papers and as part of a symposium at the Western 
Psychological Association meetings in 1959 and 1960. 
It is based in part upon a dissertation submitted to the 
University of California at Los Angeles in 1958 in 
partial fulfillment of the requirements for the PhD 
degree and in part upon subsequent work. The author 
is indebted to Roy M. Dorcus and J. A. Gengerelli 
for their valuable critique of the research design and 
to the psychology and medical staffs of the Veterans 
Administration Mental Hygiene Clinic at Los Angeles 
and the Veterans Administration Hospitals at Los 
Angeles, Long Beach, San Fernando, and Sepulveda, 
California, for their cooperation in providing subjects 
for this study. The aid of Mortimer Meyer, Harry 
Grayson, John Schlosser, Elston Hooper, Barbara 
Stewart, and Harold Giedt was indispensable. The 
author is similarly indebted to the management and 
employees of nine anonymous Los Angeles business 
firms who provided subjects for one control group at 
company expense. Finally the author is grateful to the 
staff of the Western Data Processing Center at Los 
Angeles for the use of their 650 Electronic Data Processing 
System and to Richard Johnson and William 
Irion for their help in translating the data into the language 
of the machines. 


METHOD 


Instrument. A form of the Semantic Differential was 
used as a measure of identification. The intimate rela- 
tionship between word meaning, Semantic Differential 
responses, and identification has been discussed by 
Lazowick (1955) who provides empirical evidence sup- 
porting the validity of the Semantic Differential as a 
measure of identification. Three of the Semantic Differ- 
ential scales used in the study were selected from 
among Osgood’s (Osgood & Suci, 1955) three main 
factors of meaning: good-bad, strong-weak, active- 
passive. The other seven scales were chosen for their 
particular relevance to personal adjustment and inter- 
personal relationships: personal-impersonal, masculine- 
feminine, carefree-careful, firm-flexible, proud-humble, 
familiar-unfamiliar, solitary-sociable. Thirty stimulus 
words were rated on this form of the test by each sub- 
ject. Among these stimuli were included such words as 
ME, FATHER, and MOTHER, which were used to build 
models of hypotheses about identification. 

Subjects. Subjects were chosen from paranoid schizo- 
phrenic and anxiety neurotic populations since problems 
of identification have been assumed to play important 
roles in both types of disturbance. The first experi- 
mental group consisted of 30 hospitalized paranoid 
schizophrenics. These and all other subjects were 
Caucasian males between the ages of 18 and 48. Age, 
education, and occupational level were controlled. The 
first control group consisted of 30 hospitalized tuber- 
culosis patients, chosen because tuberculosis patients 
are one of the few groups of chronically hospitalized 
patients other than psychiatric patients. Tuberculosis 
patients have been observed to be somewhat immature 
and dependent; however, this observation seems to be 
more a function of their prolonged hospitalization than 
of prior personality configuration (Dorken, 1951; Ellis 
& Brown, 1950; Hand, 1952; Seidenfeld, 1944; Stanton, 
1939; Stewart & Vineberg, 1955; Wechsberg & Sparer, 
1948). This is exactly the control effect desired in this 
study. Psychological test reports, interviews, and social 
history information were used to rule out obviously 
emotionally disturbed tuberculosis patients. In addition 
patients who had multiple tuberculosis hospitalizations 
were eliminated since this implies an immature inability 
to follow simple medical directions (Stewart & 
Vineberg, 1955). 

The second experimental group was composed of 
30 anxiety neurotic patients from an outpatient 
Veterans Administration mental hygiene clinic. Control 
subjects for this group were selected from among the 
employees of several Los Angeles business firms. Since 
these subjects were selected from the community without 
the convenient (but biasing) recruiting effect of 
university or hospital affiliation, the use of clinical 
interviews or testing as a method of screening was not 
feasible. The testing was done on employer time on the 
business or plant premises. Anonymous reports by 
the control subjects that they had never had psychiatric 
treatment or hospitalization were used as the only 
safeguard against the presence of emotional disturbance. 
Any contamination of this control group with neurotic 
subjects should militate against finding significant 
differences in comparison with the neurotic group. 

Measurement. Semantic Differential data have been 
conceived as generating profiles characteristic of the 
meaning of the rated stimulus words. A given profile 
determines a single point in a multidimensional 
semantic space. The nature of these dimensions depends 
upon the bipolar scales used in constructing the test. 
Since all stimulus words are rated on the same set of 
scales the distance between two points in the same 
multidimensional space provides an index of the similarity 
or difference of meaning of the two stimulus 
words. The following formula permits the calculation 
of the distance between any pair of points in semantic 
space. 

$$D = \sqrt{\sum_{i} (X_i - Y_i)^2} \qquad [1]$$ 

Where: $D$ = distance 
$X_i$ = the scalar score for a given stimulus word 
on the $i$th scale 
$Y_i$ = the scalar score for a second stimulus word 
on the $i$th scale. 


Actually a more efficient formula involves the use 
of the inverse of the covariance matrices of the differences 
between scales $(X_i - Y_i)$ and $(X_j - Y_j)$. This 
expression corresponds to the exponent of a multivariate 
normal frequency function for correlated variables. 
Since the actual correlation between scales in this study 
is low (algebraic mean = .21), only a negligible decrease 
in efficiency results from the use of the simpler distance 
measure given in the formula. 
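
The simpler distance measure of Formula 1 can be sketched as follows. The ratings are hypothetical, and the assumption that each of the ten scales is scored from 1 to 7 is made only for illustration; the function name is likewise not from the original study.

```python
import math


def semantic_distance(profile_x, profile_y):
    """Formula 1: square root of the summed squared scale differences."""
    if len(profile_x) != len(profile_y):
        raise ValueError("both words must be rated on the same set of scales")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(profile_x, profile_y)))


# Hypothetical ratings of three stimulus words on ten bipolar scales
# (assumed here to be scored 1-7).
me     = [2, 3, 3, 4, 2, 5, 4, 4, 1, 5]
father = [2, 2, 3, 5, 1, 5, 3, 4, 2, 5]
mother = [1, 5, 4, 2, 6, 5, 5, 5, 1, 6]

d_me_father = semantic_distance(me, father)
d_me_mother = semantic_distance(me, mother)
print(round(d_me_father, 3), round(d_me_mother, 3))
```

In this invented case ME lies closer to FATHER than to MOTHER, which is the pattern the text goes on to predict for control subjects.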

Hypotheses. A special series of validity hypotheses 
are included to insure the general validity of the test 
instrument for these groups and to determine whether 
the psychotic subjects were able to understand the test. 
Each of five validity hypotheses predicted that a pair 
of words similar in meaning would generate semantic 
profiles that would be more similar than a second pair 
of words opposed in meaning. For example, Hypothesis 
I predicts that the words MAN and FATHER should 
generate profiles more similar than are generated by 
the words HERO and FAILURE (see Table 1). 

TABLE 1 
VALIDITY HYPOTHESES 

                                              Results of sign test 
     Word pairs                                  Plus     Minus 
I    D (MAN, FATHER) - D (HERO, FAILURE)          16      104* 
II   D (IT, ANGER) - D (LOVE, KILL)               14      106* 
III  D (SIN, GUILT) - D (ANGER, CHILD)            41       79* 
IV   D (LOVE, KISS) - D (KILL, TREE)              14      106* 
V    D (HERO, SUCCESS) - D (FATHER, FEMALE)       25       95* 

Note. D stands for “the distance between . . .” 
* Significant at or beyond .05 level. 

The next series of word models tests the hypothesis 
that idiosyncratic word meaning is an integral part of 
psychopathology. The experimental groups are expected 
to agree less than the control groups as to the meaning 
of five words randomly selected from among the 30 
concepts used. Variability in word meaning was measured 
in terms of the distance between a subject’s 
profile for a given word and the mean group profile for 
that word. For example, a mean profile for the word 
CHILD was calculated for the paranoid schizophrenic 
group and the distance was measured between this 
mean profile and the individual profiles on the word 
CHILD for each paranoid schizophrenic subject. This 
hypothesis is relevant to the clinical observation that 
communication with schizophrenic patients is very 
difficult; it tests the inference that this difficulty is 
related to a lack of agreement on the very meaning of 
words. It is also relevant to Johnson’s exposition of the 
“general semanticists’” view that both neurotic and 
psychotic disorders are due to disturbances in word 
meaning (Hayakawa, 1949; Johnson, 1946). 
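
The variability measure described above (distance between each subject's profile and the group's mean profile for the same word) might be sketched as below. The three-scale ratings are invented for brevity, whereas the study used ten scales, and all names are illustrative.

```python
import math


def mean_profile(profiles):
    """Scale-by-scale mean of a group's profiles for one stimulus word."""
    n = len(profiles)
    return [sum(scale_scores) / n for scale_scores in zip(*profiles)]


def distance(p, q):
    """Euclidean distance between two profiles rated on the same scales."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def variability_scores(profiles):
    """Distance of each subject's profile from the group mean profile."""
    center = mean_profile(profiles)
    return [distance(p, center) for p in profiles]


# Hypothetical ratings of the word CHILD by three subjects on three scales.
child_ratings = [[1, 2, 3], [3, 2, 1], [2, 2, 2]]
scores = variability_scores(child_ratings)
print([round(s, 3) for s in scores])
```

Larger scores indicate a more idiosyncratic meaning relative to the subject's own group, so the hypothesis predicts larger scores in the experimental groups than in the control groups.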

TABLE 2 
PARENTAL AND/OR SEXUAL IDENTIFICATION: CONCEPT OF PARENTS 

                                                      Sum of ranks, Mann-Whitney U tests 
                                     Predicted      Paranoid       Control   Anxiety 
      Word pairs                     difference(a)  schizophrenic  (TB)      neurotic   Control 
I     D (ME, FATHER)                   E > C           871           959      1050       780* 
        - D (ME, MOTHER) 
II    D (ME, FATHER)                   E > C          1062           768*     1030       800* 
        + D (ME, MOTHER) 
III   D (ME, FEMALE)                   C > E           921           909       739      1091* 
        - D (ME, MAN) 
IV    D (ME, MAN)                      E > C           986           844      1084       746* 
V     D (MOTHER, SIN)                  C > E           939           891       914       916 
        - D (MOTHER, LOVE) 
VI    D (FATHER, SIN)                  C > E           942           888       771      1059* 
        - D (FATHER, LOVE) 

(a) Where E stands for experimental groups and C stands for control groups. Thus for Hypothesis I it is predicted that the 
indicated difference between distances will be greater for subjects in the experimental group than for those in the control group. 
* Significant at or beyond .05 level. 

In Table 2 the exact form of the hypotheses about 
identification appears. Hypothesis I may be read as 
follows: When the semantic distance between the points 
representing the words ME and MOTHER is subtracted 
from the corresponding distance between ME and 
FATHER, a larger algebraic difference should result for 
the experimental subjects than for the control subjects. 
That is, for each control subject the words ME and 
FATHER should have many common elements of meaning 
as a result of a healthy masculine identification, and 
therefore should determine points relatively adjacent 
in semantic space. On the other hand we expect a 
rather distinct differentiation in meanings between 
the words ME and MOTHER for a control subject. According 
to the hypothesis that confusion in sexual identification 
is an essential feature of the process resulting in 
paranoid schizophrenia, the results of these measures 
should be significantly different from those of the 
paranoid schizophrenic subjects. 

Hypothesis II is the operational form of the hypothesis 
that the basic problem in paranoid schizophrenia 
is a lack of identification with both parental figures. 
Thus the distance between the words ME and FATHER 
is added to the distance between the words ME and 
MOTHER, the resulting sum being hypothetically smaller 
for the control groups than for the experimental groups. 
Hypothesis III tests a variation of the first hypothesis 
about sexual identification. Without tapping parental 
identification directly, measurements were obtained 
which are relevant to sexual identification by using 
the words ME, MAN and ME, FEMALE.² Hypothesis IV 
predicts a smaller distance between the words ME and 
MAN for the control than for the experimental groups, 
thus examining the identification of the subjects as 
“a man” (with all of its implications about adequacy, 
etc.) independent of feminine identification. 

² A more logically satisfying model of this hypothesis 
might have used the antonyms (MALE, FEMALE) or 
(MAN, WOMAN). These pairs of stimulus words were not 
available in this study because it is part of a larger study 
in which the number of stimulus words was limited. 

Hypothesis V deals with the subject’s picture of his 
mother in relation to a highly positive word (LOVE) 
and a highly negative word (SIN). The control groups
are expected to obtain significantly larger scores on this 
measure, on the basis of the commonly accepted clinical 
assumption of the mother’s key role in the emotional 
development of the child. Control subjects should 
locate MOTHER and LOVE relatively adjacently in 
semantic space but MOTHER and SIN at a relatively great 
distance. Since, as in the other hypotheses, the subject 
does not know that these particular stimulus words 
are being compared, the measurements are not deter- 
mined by deliberate, conscious statements about the 
relationship between the stimulus words. Hypothesis
VI is identical to V except that the key stimulus word 
is FATHER. It tests the importance of a favorable father 
image to the emotional development of the child. 
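
The distance measures behind these hypotheses can be illustrated with a short sketch. It assumes that D is the usual Semantic Differential distance, that is, the Euclidean distance between two concepts' rating profiles; the excerpt itself does not spell out the formula. The ratings, the number of scales, and the function name are hypothetical and are not data from this study.

```python
import math

def semantic_distance(profile_a, profile_b):
    """Euclidean distance between two rating profiles (assumed form of D)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

# Hypothetical 7-point ratings of three concepts on four bipolar scales.
ratings = {
    "ME":     [5, 4, 6, 3],
    "FATHER": [6, 4, 5, 3],
    "MOTHER": [2, 6, 3, 5],
}

d_me_father = semantic_distance(ratings["ME"], ratings["FATHER"])
d_me_mother = semantic_distance(ratings["ME"], ratings["MOTHER"])

# Hypothesis I score: a difference between the two distances.
score_h1 = d_me_father - d_me_mother
# Hypothesis II score: the sum of the two distances.
score_h2 = d_me_father + d_me_mother

print(round(d_me_father, 3), round(d_me_mother, 3))
print(round(score_h1, 3), round(score_h2, 3))
```

For this made-up subject, ME lies much closer to FATHER than to MOTHER, so the Hypothesis I score is negative, which is the pattern expected of a control subject.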


RESULTS AND DISCUSSION 


The data were punched into IBM cards and
the distances between words were computed on
an IBM 650 Electronic Data Processing
System and auxiliary equipment. The results
for each hypothesis consisted of 30 distance
measures per group. Since the direction of
differences was predicted, a one-tailed Mann-
Whitney U test served as a test of significance.
In the case of the validity hypotheses a simple
sign test was used on the pooled data of all
four groups.
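
As a rough illustration of how a sum of ranks translates into a one-tailed test, the sketch below converts a rank sum into U and applies the large-sample normal approximation. This is an assumption made for illustration: the original analysis may have relied on tabled critical values, and no correction for ties or continuity is applied here. The function name is hypothetical; the rank sum of 768 with 30 subjects per group is the control value reported for Hypothesis II in Table 2.

```python
import math

def mann_whitney_from_rank_sum(rank_sum_low, n_low, n_high):
    """Return (U, z, one-tailed p) for the group predicted to score lower.

    Uses the normal approximation without tie or continuity corrections.
    """
    u = rank_sum_low - n_low * (n_low + 1) / 2
    mean_u = n_low * n_high / 2
    sd_u = math.sqrt(n_low * n_high * (n_low + n_high + 1) / 12)
    z = (u - mean_u) / sd_u
    # Lower-tail probability of the standard normal distribution at z.
    p = 0.5 * math.erfc(-z / math.sqrt(2))
    return u, z, p

# Hypothesis II, paranoid schizophrenic group versus its control group:
# the control group is predicted to have the smaller summed distance.
u, z, p = mann_whitney_from_rank_sum(768, 30, 30)
print(u, round(z, 2), round(p, 3))
```

Under these assumptions the approximation agrees with the table entry, which marks this comparison as significant at the .05 level.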


The meaningfulness of all other hypotheses 
depends upon the results of the validity hy- 
potheses as reported in Table 1. All of the 
differences are significant. Since a special com- 
parison of the validity measures for the 
paranoid schizophrenic group and its control 
group revealed no significant difference on any 
of the five validity subhypotheses, we may 
conclude that the subjects were able to under- 
stand the task. 

The next set of hypotheses concerns the de- 
gree of agreement or idiosyncrasy of word 
meaning as measured by variability about 
mean profiles. The results indicate a significant 
difference between the paranoid schizophrenic 
group and its control group for four of the five 
words tested (see Table 3). There are also sig- 
nificant differences between the anxiety neu- 
rotic group and its control group for two of the 
five words tested. In every case the differences 
are in the predicted direction. 

TABLE 3
VARIABILITY OR IDIOSYNCRASY OF WORD MEANING

          Sum of ranks, Mann-Whitney U test
Word      Paranoid       Control  Anxiety   Control
          schizophrenic  (TB)     neurotic
CHILD     1037           793*     875       955
PENIS     991            839      861       969
ANGER     1040           790*     808       962
MOTHER    1027           803*     1079      751*
ME        1039           791*     1092      738*

* Significant at or beyond .05 level.

These results support the clinical impression
that the peculiar use of language by paranoid
schizophrenic patients is at least partly due to
a disturbance in word meaning itself. In view
of the results of the validity hypotheses, this
distortion evidently occurs even when the
paranoid schizophrenic patient has a generally
accurate understanding of a word’s meaning:
even when the language of the paranoid
schizophrenic is apparently coherent, there
evidently remain hidden blocks to communication.
For example, the results indicate that “normal”
subjects agree fairly well on the subtle, con- 
notative aspects of the meaning of the word 
ANGER. Paranoid schizophrenic subjects, on 
the other hand, agree less on the connotative 
meanings of the word ANGER and may, in fact, 
have very different conceptions of “ANGER” 
among themselves. One would then predict 
that in an apparently rational conversation 
with a paranoid schizophrenic there may sud- 
denly come a startling point of disagreement 
due to these subtle differences in word meaning. 
For example, paranoid schizophrenics might 
vary in their understanding of ANGER from such 
concepts as “mild, inoffensive irritation” to
“world destructive, murderous rage,” whereas 
normal subjects would agree more specifically 
on a meaning somewhere between these ex- 
tremes. 

Some disagreement as to word meaning is re- 
flected in the case of the anxiety neurotic 
group, but the effect is apparently less general. 
We would thus expect some misunderstanding 
and blocking in communication with anxiety 
neurotic patients, but on a less widespread 
basis. These experimental results perhaps re- 
flect the irrationality of neurotic patients in 
certain sensitive conflict areas whereas mis- 
understandings occur in almost any com- 
munication with paranoid schizophrenics. 

The results of the hypotheses involving 
identification and the parental picture are re- 
ported in Table 2. For the paranoid schizo- 
phrenic group, only one of the six hypotheses is 
supported at a significant level. For the anxiety 
neurotic group, five of the six hypotheses are 
significantly supported. These results give no 
support for the hypothesis that the paranoid 
schizophrenic tends to identify with the wrong 
parental or sexual figure (Hypotheses I and
III). Nor is there evidence to suggest a lack of
identification as a “man” (Hypothesis IV),
whatever combination of sexual, virile, and
generic connotations this term may have. The
only significant difference between the paranoid
schizophrenic group and its control group appeared
in the extent of combined identification
with both parental figures (Hypothesis II).
These results tend to support the position that 
the paranoid schizophrenic, like other schizo- 
phrenics, is extremely isolated from other 
people, that he cannot empathize with them or 
understand them, all of which perhaps roots 
from an original failure to identify with parents 
of either sex. 

The neatness of this explanation is disturbed, 
however, when the results for the neurotic 
group are considered. For the anxiety neurotic 
group, as compared with its control group, five 
of the six hypotheses are supported at a signifi- 
cant level. Included among these is the hy- 
pothesis (II) about combined identification 
with both parental figures. Thus it is found 
that the kind of “distancing” or “estrange- 
ment” from parental figures that is implied by 
this measure is found in anxiety neurotic as 
well as paranoid schizophrenic patients. How- 
ever, both of the hypotheses about confused 
sexual identification (or identification with the 
wrong figure) are significantly supported for 
the anxiety neurotic group although they 
were not supported for the paranoid schizo- 
phrenics. Thus the results suggest that the 
concept of “confusion in sexual identification” 
is consistently and meaningfully associated 
with anxiety neurosis, and they do not support the common
clinical assumption that confusion in sexual
identification, or an identification with the
opposite-sexed parent, constitutes an essential
etiological factor in paranoid schizophrenia.
The evidence for confusion in sexual identifica- 
tion for the anxiety neurotic group exists with 
respect to both Hypothesis I (parental models) 
and III (more general expression of sexual 
role). 

Finally the word FATHER is placed closer to 
SIN than to LOVE for the anxiety neurotics as
compared with their control subjects. This indi- 
cates a more negative, less loving picture of the 
father figure for the neurotic group. No such 
negative image seems to surround the mother 
figure as tested in Hypothesis V. It will be
remembered that all subjects in this study were
males. 


The results of this study can be compared 
with work done by Luria and by Lazowick in- 
volving the role of identification in neurotic 
conditions. Different forms of the Semantic 
Differential were used in these studies and 
different measurement operations and subject 
populations were used, but these differences 
provide an acid test of the breadth, consistency, 
and generalizability of findings. Luria tested 
“neurotic” college students (Osgood, Suci, & 
Tannenbaum, 1957) and found no general tend- 
ency among a mixed group of male and female 
students to identify with the parent of the 
wrong sex. Although this is contrary to the 
present findings, a comparable study by Lazo- 
wick (1955) indicates the probable reason for 
this difference. He found that “neurotic” male 
students (as defined by high scores on the 
Taylor Manifest Anxiety scale) perceived sig- 
nificantly more semantic similarity between the 
words (FATHER, MYSELF) as compared to 
(MOTHER, MYSELF) but the reverse did not hold 
true for the female subjects. He discussed these 
findings in terms of the relatively masculine 
goals of college women. Since the present study 
included only male subjects, its results are con- 
sistent with the study of Lazowick and the 
negative findings of Luria may well be due to 
the presence of female college students in the 
sample. The inclusion of the female subjects 
may also explain the fact that Luria’s analysis 
of the responses on the evaluative scales for 
the words MOTHER and FATHER indicated that 
the neurotic group tended to “vilify” both 
parental figures as hypothesized by Mowrer 
(1953). This is in agreement with the results 
of Hypothesis V (involving the words MOTHER,
SIN, LOVE) but is not in agreement with the 
results of Hypothesis VI (involving the words 
FATHER, SIN, LOVE). This inconsistency may 
also result from the inclusion of female sub- 
jects in Luria’s study as opposed to the exclu- 
sively male population used in the present one. 


SUMMARY 


A form of the Semantic Differential was used 
to study differences between hospitalized 
paranoid schizophrenic subjects and a control
group, and anxiety neurotic subjects and a 
second control group. Widespread areas of 
variability in word meaning for paranoid 
schizophrenic subjects, and limited areas of 
variability in word meaning for anxiety neu- 
rotic subjects were found. The paranoid schizo- 
phrenic group had a generally poor identifica- 
tion with both parental figures, as did the 
anxiety neurotic group. Evidence was found 
for “confusion in sexual identification” for the 
anxiety neurotic group but not, as is commonly 
assumed, for the paranoid schizophrenic group. 
SOME EFFECTS OF “SUSPICIOUS” VERSUS “TRUSTING”
TRAINING SCHEDULES¹
HAROLD H. KELLEY AND KENNETH RING


University of Minnesota 


In order to teach another person to do
something, a trainer often uses his ability
to affect the rewards and punishments
that person receives. For example, to teach
her child to wash his hands before each meal, 
a mother may selectively administer rewards 
and punishments, rewarding the child when 
he washes and/or punishing him when he fails 
to do so. In the terms proposed by Thibaut 
and Kelley (1959), the mother has fate 
control over the child—the ability to affect 
his outcomes, his fate, regardless of anything 
he might do. Through selective use of this 
fate control, providing better outcomes when 
the child produces the desired behavior and 
poorer outcomes when he exhibits nondesired 
behavior, behavioral preferences on the part 
of the child can be created. Thibaut and 
Kelley describe this as converting fate control 
to behavior control. 

In the conversion of fate control, the problem 
of monitoring the trainee’s behavior is a 
critical one. If the trainer is to teach the 
trainee to produce a given behavior, it is 
desirable that he consistently reward that 
behavior and, speaking relatively, punish other 
behaviors. Without such consistency, the 
trainee cannot develop a clear idea of what is 
expected of him and will show imperfect 
adherence to the trainer’s prescriptions. 
Trainer consistency requires that he know on 
each behavior occasion what the trainee has 
done so that he may adjust his own behavior 
accordingly. However, it is often true that the 
trainer is unable to monitor all of the occasions 
on which the trainee makes relevant behavior 
choices. Trainers usually have other activities 
and duties that interfere with the maintenance 
of surveillance over any given trainee. On the 
other hand, a trainer need not depend entirely 
upon his own efforts in this respect. It is 
possible for the trainee to contribute to the 
monitoring by bringing his actions to the
attention of the trainer and by presenting
evidence as to his most recent behavioral
choices.

1 This research was supported by the Laboratory for
Research in Social Relations and by Grant G-5553 to
the senior author from the National Science Foundation.
We are greatly indebted to John Hatton for his
assistance in carrying out the experiment.

The trainee, then, can contribute to his 
own surveillance and thereby increase the 
consistency of the training he receives and his 
mastery of the trainer’s demands. Once we 
recognize this possibility, it becomes important 
to ask what factors in the trainer’s behavior 
affect the degree to which the trainee aids and 
abets the monitoring process. The general 
hypothesis in the present experiment is that 
an important factor in this respect is the 
action the trainer typically takes in the absence 
of monitoring evidence. 

Let us consider those occasions when the 
trainer knows the trainee has been confronted 
with a pertinent behavior choice but does not 
know what choice he has made. The trainer 
has available several courses of action: he can 
assume the trainee made the correct choice 
and reward him, he can assume he made the 
incorrect choice and punish him, or he can 
disregard the situation and take no action. 
The first course of action is likely to encourage 
concealment of behavior. As long as the 
trainee can obtain reward if the trainer fails to 
see his actions, he will have no reason to 
learn to show them to the trainer. He can 
obtain positive outcomes by making either the 
correct response publicly or the wrong response 
privately. (It is assumed here that learning 
the discrimination desired by the trainer has 
no intrinsic value for the trainee, but serves 
only to help him accommodate to the demands 
of the trainer. This is especially likely to be 
true in situations where the trainee has not 
voluntarily engaged himself in the relationship 
with the trainer.) On the other hand, while 
the second approach may seem unnecessarily 
punitive and rather cruel, it is likely to have 
advantages from the point of view of encourag- 
ing the trainee to bring his behavior to the 
trainer’s attention. The punishment would 
tend to discourage any tendencies toward 
concealment the trainee might have. The 


obtained) with certainty only by making the 
correct choice and presenting evidence of it to 
the trainer. The third course of action is 
intermediate in these respects. The trainer’s 
inaction encourages neither hiding nor showing. 
However, insofar as the trainee finds the 
incorrect behavior intrinsically more rewarding 
(and this is true in many of the more important 
training situations), the trainee will find 
concealment to be advantageous. Under these 
conditions, the trainer’s inaction in the 
absence of monitoring information would be 
expected to have effects similar to giving 
reward. 

The present experiment compares the first 
two of these courses of action. They are 
referred to, respectively, as a trusting schedule 
and a suspicious schedule. We propose that in 
the absence of knowledge of the trainee’s 
action, the trainer who adopts a training 
schedule based on a suspicious attitude will 
be more successful than one who adopts a 
schedule based on a trusting attitude. Specifi- 
cally, it is suggested that the suspicious 
schedule will encourage the trainee to aid the 
monitoring process to a greater extent than 
will the trusting one. In consequence, the 
trainee will tend to learn the trainer’s demands 
better under suspicious treatment and will 
eventually feel he has greater mastery over 
his outcomes in the relationship than if he had 
been treated with trust. The details of these 
hypothesized effects will be elaborated after 
the experimental procedure has been described. 


METHOD 
Subjects 


The 52 male subjects in the experimental groups and 
the 41 males in the control group were all under- 
graduates enrolled in introductory psychology courses 
at the University of Minnesota. 


Procedure 


Each subject found himself in a room with the ex- 
perimenter and another “subject” who was actually a 
confederate of the experimenter. The confederate was 
represented as being an advanced student in clinical 
psychology. They were told that the experimenter was 
interested in studying the effect of opinions of a quali- 
fied person (the clinical student) on the performance of 
an unqualified person (the subject). Working
independently, they would be given the same series of
cards on each of which would appear two statements 
describing psychological symptoms. Each man was to 
decide which statement of each pair “indicates the 
greater psychopathology, that is, psychological illness.” 
They were further instructed to evolve during the
course of this task “a reasonable, consistent, and justi- 
fiable criterion of psychopathology” to use in judging 
each pair of items. In this process, the subject would 
have available the evaluations of his judgments made 
by the more expert person and the experimenter was 
presumably interested in how the subject would make 
use of these evaluations. 

The items themselves consisted of statements from 
the MMPI and variants thereof, selected on the basis 
of whether they reflected depression (including its 
effects) or anxiety (and its effects). Data were gathered 
prior to the experiment to determine the degree to which 
items conveyed either the one or the other attribute and 
a list of 37 pairs was then drawn up. Each pair always 
consisted of one depression item and one anxiety item. 
For example, on the third card in the series appeared 
the two statements, “I become pretty depressed when I 
fail at something important” and “I have nightmares 
every few nights.” On the twentieth card were the two
statements “Almost every day something happens to 
frighten me” and “My life has turned out to be more 
wretched than I ever expected.” 

The subject was asked to go through the stack of 
cards, make a choice on each one as to which statement 
of the pair manifested the greater psychopathology, and 
register that choice by turning one of two switches. 
This decision of his, he was told, would appear on a light 
panel next door (where the confederate would ostensibly 
be making his selections) and would be evaluated by the 
“qualified student.” Then the crucial instructions
concerning the subject’s showing or concealing his choices
were given. 

It was explained that the subject had the option of 
using the graduate student’s evaluations in arriving at 
his own criterion. If the subject wanted to insure 
that the “qualified person” saw his choice on a given
trial (and since the latter had some vague “additional 
functions” to attend to in the adjoining room, he might 
not always be in a position to observe the panel), he was
to pull a switch which would “hold” his response on the 
panel until it was sure to be seen. On the other hand, if 
the subject was not particularly eager that his choice 
be seen, he was told to pull another switch which was 
said to terminate his panel light immediately. For each 
trial, then, the subject had two decisions to make:
choosing the item that represented the greater psycho- 
pathology and keeping on or terminating the indication 
of his first choice on the panel in the next room. 

The experimenter further stated that the graduate 
student would be compelled to make some evaluative 
response to each choice of the subject, whether or not he 
had actually seen the subject’s choice on a trial. If the 
qualified person approved of the subject’s choice, he 
would flash on a green light on the subject’s panel. If he 
disapproved, he would give a red light followed by an 
electric shock. (Two brief sample shocks were given the 
subject just before the experiment began.) A neutral re- 
sponse would be indicated by a white light. The subject 
could choose to work with the qualified person (by 
“showing” his responses) or independently (by terminating
his first choice by pulling a second switch), although
he did not always have to work with or independently
of the qualified person. The subject could
use the latter’s evaluations to help in formulating his 
own criterion of psychopathology or he could disregard
them. The subject could change his criterion at any
time. His job, he was told (repeatedly), was simply to 
arrive at the best criterion of psychopathology that he 
could. Perhaps the qualified person’s evaluations would 
aid him, perhaps he would do better by himself. After 
answering any questions the subject had, the experi- 
menter took the confederate next door and the experi- 
ment began. 


Manipulation of Trainer’s Schedule 


The trainer (confederate) had a panel on which signal
lights indicated the subject’s choice of item and the
subject’s decision to “show” or “hide” his choice. Using
this information, the confederate delivered rewards
(green lights) or punishments (red light and electric
shock).² He followed one of two prescribed training
schedules that may best be summarized by the two 
matrices shown in Figure 1. In these matrices, the sub- 
ject’s possible responses are indicated along the left 
and the trainer’s possible responses, across the top. 
“Right” stands for the choice the trainer regarded as 
correct and “wrong,” for the response he considered in- 
correct. Show means that the trainee was willing to 
make sure the trainer saw his choice and hide, that he 
wished to conceal his choice. In the body of the matrix, 
the pluses and minuses refer to the rewards and punish- 
ments dispensed by the trainer (or, the outcomes re- 
ceived by the trainee) and the number in the upper 
right hand portion of each cell indicates the probability 
that the subject would be given the reward or punish- 
ment associated with that cell. 

As these matrices show, the two schedules are identi- 
cal when choices are called to the trainer’s attention. In 
both cases, if the subject does the right thing and shows 
it to the trainer, he is rewarded 100% of the time; if he 
does the wrong thing and shows it, he is punished 100% 
of the time. 


2 The white “neutral” light was never used. The shock
delivered ranged from 17 volts on early trials to 30
volts on later ones. The alternating current was generated
by a high internal impedance apparatus designed
by David Lykken, to whom we are indebted for
its use.




The schedules differ with respect to the assumption 
made in the absence of monitoring. The “trusting” 
trainer always assumes the best. Unless he has evidence 
to the contrary, as far as he is concerned the subject 
has performed correctly. Because differences in trainer 
schedules are important when the trainer’s ability to 
monitor is low, in both conditions we have instituted a 
20% monitoring rate which means that the trainer 
will be able to observe what the subject has done on only 
20% of the trials on which the subject attempts to hide. 
Of course, the confederate actually knows what the 
subject has chosen every time but he follows a program 
which dictates that he act as if he knew this on only one 
in every five trials on which the subject “hides” (which 
ones being determined randomly). Thus, 20% of the 
time the trusting trainer will “catch” the subject doing 
“wrong-hide” and will punish him for it; the subject 
will “get away with it” 80% of the time, however. When 
the subject is right and hides, he is rewarded 20% of the 
time because the trainer has detected his correctness 
and the other 80% of the time because he has succeeded 
in concealing his choice and the trusting trainer has 
given him the benefit of the doubt. 

Considering now the suspicious trainer we see that 
the subject can never “get away with anything” for this 
trainer always assumes the worst; unless he has evidence
to the contrary, he will assume the subject deserves
punishment. Even when the subject is correct,
if he fails to show his choice he is rewarded only in those 
instances (20%) when the trainer detects his choice. 

Our experimental hypotheses may now be stated 
more precisely. First, there will develop a difference be- 
tween subjects under the two schedules in frequency of 
showing, those in the suspicious condition showing 
more. (Whether the subjects in the trusting condition 
will decline in showing or those in the suspicious condi- 
tion will increase will depend upon the initial level of 
showing created by the general “set” given the subjects
by the experimental instructions.) Reference to the ma- 
trices describing the two conditions shows why this is so. 
The subject stands to profit under the trusting trainer 
simply by concealing what he does. For even if he re- 
sponds at random in terms of right and wrong, 90% of 
the time he’ll be rewarded if he hides, but only 50% if he 
shows. On the other hand, a subject working under the 
suspicious trainer will find that he gets rewarded only 
infrequently when he hides (10%) but substantially 
more often when he shows (50%). He should accord- 
ingly learn to show (or continue showing if this is his 
initial tendency). 
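
The percentages quoted above follow directly from the two schedules and the 20% monitoring rate. The sketch below recomputes them. The function names and the probability-table layout are assumptions made for illustration and do not reproduce the matrices of Figure 1; random responding is taken to mean that the subject is right on half of the trials.

```python
P_DETECT = 0.20  # monitoring rate on trials where the subject hides

def reward_probability(schedule, correct, shows):
    """Probability of reward for one trial under the described schedules."""
    if shows:
        # Both schedules: shown choices are always rewarded or punished as deserved.
        return 1.0 if correct else 0.0
    if schedule == "trusting":
        # Hidden choices are rewarded unless a wrong choice happens to be detected.
        return 1.0 if correct else 1.0 - P_DETECT
    if schedule == "suspicious":
        # Hidden choices are punished unless a right choice happens to be detected.
        return P_DETECT if correct else 0.0
    raise ValueError(f"unknown schedule: {schedule}")

def expected_reward(schedule, shows, p_correct=0.5):
    """Expected reward rate for a subject who is right with probability p_correct."""
    return (p_correct * reward_probability(schedule, True, shows)
            + (1 - p_correct) * reward_probability(schedule, False, shows))

for schedule in ("trusting", "suspicious"):
    for shows in (False, True):
        label = "show" if shows else "hide"
        print(schedule, label, round(expected_reward(schedule, shows), 2))
```

With p_correct left at 0.5, hiding pays better than showing under the trusting schedule and worse under the suspicious one, which is the contrast on which the first hypothesis rests.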

Second, subjects working with the trusting trainer 
will not learn to make the distinction between right and 
wrong as rapidly as will subjects who have the sus- 
picious trainer. Under either trainer, the subject ob- 
tains veridical information about how his choices cor- 
respond to those of the trainer only when he aids the 
trainer’s monitoring by showing. If he learns to show 
more under the suspicious trainer (per Hypothesis I), his 
subsequent information about his performance will be 
better and he will be better able to learn the trainer’s
criterion. Because the subject is encouraged to hide by 
the trusting trainer, discrimination between right and 
wrong is made more difficult and, hence, learning 
retarded. This hypothesis depends upon Hypothesis I 
and assumes that the discrimination between showing 
and hiding one’s responses is learned more quickly than 
the discrimination between what the trainer desires
from the subject and what he does not. 

Third, if the first two hypotheses be correct, we 
should expect at the end of the training that subjects 

ting under the trusting trainer will have felt less 
responsible for (or control over) the kinds of outcomes 
they received as compared to subjects who have sus- 
picious trainers. These latter should ultimately find 
themselves in a more stable situation of converted fate 
control, with outcomes reliably dependent on what re- 
sponse they themselves make. 

It is unmistakable how closely intertwined these 
three hypotheses are. Each for its confirmation rests 
on the validity of the one(s) which preceded. It should 
also be noted that the two conditions are identical for 
subjects who always show their choices to the trainer. 
Such subjects do not experience the different schedules 
of the trainer because these schedules specify different 
actions only in the event of hiding. Hence, comparison 
of the two conditions is meaningful only for subjects 
who hide their choices on at least some of the trials. 

In the course of the experiment, the confederate 
trainer used two different criteria of what was right and 
wrong: For half the subjects he consistently chose the 
depression items and for the other half, the anxiety 
statements. Thus, there were altogether four experi- 
mental conditions which may be summarized as follows, 
giving first the trainer’s schedule and then his criterion:
suspicious-anxiety (Sus-Anx), suspicious-depression
(Sus-Dep), trusting-anxiety (Tru-Anx), and trusting-depression
(Tru-Dep). Of the 52 experimental subjects,³
12 showed their choices on all or all but one trial (9 with 
the depression criterion and 3 with the anxiety criterion) 
and were eliminated from the comparison of the condi- 
tions for the reason given above. The remaining 40 pro- 
vided 10 subjects in each of the four conditions. Subjects 
were assigned to experimental conditions according to 
a rotating schedule with some deviations tolerated in 
order to secure equal Ns.


Control Group Data 


Control group data were needed at the end of the
experiment to determine the relative frequency with 
which the two items of each pair were selected when no 
reinforcing consequences were involved. Accordingly, 
booklets containing the pairs of statements, each pair 
printed on a separate page and in the same order as in 
the experiment, were distributed to male students in an 
introductory psychology class. Of 57 men, 41 were se- 
lected as being comparable to the subjects used in the 
experiment (in terms of age, class, etc.) and only their 
data were used in establishing norms. Each subject was 
simply asked to go through the pairs of statements, 
one by one, and decide which statement of each pair 
manifested the greater psychopathology, or psycho- 
logical illness. Each was instructed to try to evolve a 
reasonable, consistent, and justifiable criterion of 
psychopathology to facilitate his decisions. 








3 Several subjects are not included in this total. They
either suspected that the trainer was a confederate
(six cases), failed to understand the instructions (one
case), or could not tolerate the shock (one case).
TABLE 1
AVERAGE PERCENTAGE SHOWING IN SUCCESSIVE BLOCKS OF THE TRAINING

                            Trials
Condition           1-12    13-25   26-37   All
Sus-Dep (N = 10)    78      75      77      76
Tru-Dep (N = 10)    62      50      42      51
Sus-Anx (N = 10)    62      66      61      63
Tru-Anx (N = 10)    78      51      47      58

RESULTS 

Showing and Hiding 


The results on frequency of showing are 
presented in Table 1. The tabled values are 
the averages of the percentages of trials on 
which subjects showed their choice to the 
trainer. Subjects in all conditions begin with 
a moderately high rate of showing, about 70% 
of the time. (It will be remembered that some 
subjects who showed on every trial have 
already been eliminated from the analysis.) 
As expected, there develops a sharp difference 
between the suspicious and trusting conditions. 
The subjects under the suspicious trainer 
maintain the initial level of showing. Those 
under the trusting trainer, although initially 
at about the same level, evince a steady decline. 
An analysis of variance which includes a 
trend analysis shows that there is a significant 
overall downward trend (p < .01) and a 
significant schedule-by-blocks interaction effect 
(p < .01). (Two-tailed tests of significance are 
used throughout this paper.) The latter indi- 
cates that the downward trend is steeper in 
the trusting condition than in the suspicious 
one. In terms of amount of showing over all 
trials, there is a significant schedule effect 
(p < .05). As can be seen in Table 1, the 
suspicious schedule yields more total showing 
than the trusting condition. The effect of the 
suspicious trainer, then, is to prevent a decline 
in subjects bringing choices to his attention, 
the total result being that he is shown more 
of their choices than is the trusting trainer. 
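
The tabled values can be illustrated with a minimal sketch, assuming each subject's record is a list of 37 show/hide flags and that the blocks are trials 1-12, 13-25, and 26-37 as in Table 1. The function names and the two example subjects are hypothetical and are not data from the study.

```python
# Block boundaries taken from the column headings of Table 1 (1-indexed, inclusive).
BLOCKS = [(1, 12), (13, 25), (26, 37)]


def percent_showing(shown, first, last):
    """Percentage of trials first..last on which the subject showed his choice."""
    block = shown[first - 1:last]
    return 100.0 * sum(block) / len(block)


def condition_averages(subjects):
    """Average, across subjects, of each subject's per-block percentage of showing."""
    return [
        sum(percent_showing(s, first, last) for s in subjects) / len(subjects)
        for first, last in BLOCKS
    ]


# Two made-up subjects (True = showed the choice on that trial).
always = [True] * 37
declining = [True] * 12 + [True, False] * 6 + [False] + [False] * 12

averages = condition_averages([always, declining])
print([round(a, 1) for a in averages])
```

Averaging the per-subject percentages, rather than pooling all trials, gives every subject equal weight in the condition mean, which is how the text describes the tabled values.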

Table 1 also reveals that the Sus-Dep 
subjects have a rather high initial level of 
showing which they maintain. Consequently, 
their overall level is markedly higher than that 
of the Tru-Dep subjects (p < .01). In contrast, 
in the anxiety condition the Tru-Anx subjects 
have a high initial level and even though they 
drop sharply, their overall level is almost as
high as that of the Sus-Anx sample (the 
difference for all trials is not significant). 
Although the initial levels of showing are not 
significantly different as between the various 
conditions, it is possible that small differences 
in showing on the early trials have large 
effects on the learning, so it is necessary to 
take account of these initial showing rates in 
the analysis of the learning data, as described 
below. 


Learning the Trainer’s Criterion 


The data on learning the trainer’s criterion 
are presented in Table 2. The average per- 
centage correct for various portions of the 
training are shown for the four experimental 
groups and the control sample. The latter 
data indicate how often the depression items 
(or anxiety items) are chosen in the absence 
of any training. They afford a baseline from 
which learning in the experimental groups can 
be assessed. We observe that the groups in 
which the depression criterion was used lend 
support to the hypothesis: Subjects learn 
better under the suspicious trainer than 
under the trusting one (p < .05). However, 
with the anxiety criterion, the suspicious and 
trusting conditions are not differentiated. 

These differences in learning correspond to 
those in overall showing: A significant trusting- 
suspicious difference is obtained in the depres- 
sion condition where there was a large 
difference in overall level of showing and 
no such difference is obtained in the anxiety 
condition where there was only a marginal 
difference in total showing. And as already 
noted, the overall levels of showing reflect, in
part, differences in initial rates. This raises 
two problems in evaluating our hypothesis. 
On the one hand, the difference supporting the 
hypothesis in the depression condition may 
simply reflect the accidentally high initial 
level of showing in the suspicious sample. 
On the other hand, the lack of difference in 
the anxiety condition, which is contrary to 
the hypothesis, may reflect the accidentally 
high initial level of showing in the Tru-Anx 
sample. 

To evaluate our hypothesis in a manner 
that rules out possible effects of initial differ- 
ences in showing, a regression analysis was 
made in which amount of showing in the first 


TABLE 2

AVERAGE PERCENTAGE CORRECT IN SUCCESSIVE BLOCKS OF THE TRAINING

                                                  Trials
Condition                                1-12    13-25    26-37    All trials
Sus-Dep (N = 10)                          72       80       78       76 a,b
Tru-Dep (N = 10)                          71       59       65       65 a
Control (Depression choices) (N = 41)     67       63       66       65
Sus-Anx (N = 10)                          42       46       58       49
Tru-Anx (N = 10)                          42       54       48       [illegible]
Control (Anxiety choices) (N = 41)        33       37       [illegible]   34

a Significantly different from each other, p < .05.
b Significantly different from the control group, p < .05.



block of 12 trials was correlated with the 
subsequent amount of learning. As a measure 
of the latter, we used simply the number of 
items correct on the last two blocks (25 trials). 
This seemed warranted because the suspicious 
and trusting samples within each criterion 
condition are almost identical in their initial 
rates of correctness. This measure of learning 
was plotted against frequency of showing on 
the first block of trials. The correlations were 
obtained separately for the two criterion 
conditions⁴ and the regression of subsequent
learning upon initial showing was calculated 
for each case. Using these regression lines it 
was possible to correct each subject’s learning 
score for the amount of early showing he did. 
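
The correction can be sketched as follows. The text does not give the exact formula, so this is an assumption: the usual covariance-style adjustment, in which each learning score is moved along the fitted regression line to the value expected at the group's mean level of initial showing. The function names and the example numbers are hypothetical. In the study the regression was computed separately for each criterion condition, so a function like this would be applied once per condition.

```python
from statistics import mean


def regression_slope(x, y):
    """Least-squares slope of y (later learning) on x (initial showing)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("initial showing does not vary; slope is undefined")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def corrected_scores(initial_showing, later_correct):
    """Remove from each learning score the part predicted by early showing."""
    b = regression_slope(initial_showing, later_correct)
    mx = mean(initial_showing)
    return [y - b * (x - mx) for x, y in zip(initial_showing, later_correct)]


# Made-up example: trials shown in block 1, items correct on the last 25 trials.
showing = [6, 8, 10, 12]
correct = [14, 15, 18, 19]
adjusted = corrected_scores(showing, correct)
print([round(a, 2) for a in adjusted])
```

After the adjustment the scores keep their original mean but no longer covary linearly with initial showing, so a schedule effect found in them cannot be attributed to chance differences in early showing.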

An analysis of variance of the corrected 
scores yields a significant schedule effect 
(p < .025) with the suspicious schedule 
yielding more correct responses on the last 
two blocks of trials than the trusting schedule. 
Although most of the schedule effect is due to 
the difference within the depression criterion 
condition, the interaction between schedule 
and criterion is not significant. 

Thus, even when initial rate of showing is 
taken into account, it appears that our 
hypothesis is supported, clearly so with the 
depression criterion and not so clearly so for 
the anxiety criterion. There is greater learning 
under the suspicious trainer than under the 
trusting one. The greater decline in showing 
under the latter’s schedule during the second 


4 Initial showing was correlated with later performance
only in the anxiety criterion condition (r = .46, p
< .05). The correlation was not significant (r = -.0[illegible])
in the depression condition.
and third periods (as noted in Table 1) is
accompanied by a lower level of adherence to 
his criterion. 

Incidentally, another basis for evaluating 
the learning in the Sus-Dep condition is 
provided by nine subjects trained under the 
depression criterion who showed their responses 
all the time. They made 73% correct responses 
on the first block of the training trials, 73% 
on the second block, and 81% on the last 
block, with an overall rate of 75% correct. 
While this group that received complete 
monitoring learned significantly more (p <
.05) than the control group (no training), it
appears they did no better than the partially 
monitored group working under the suspicious 
trainer. 


Feelings of Responsibility and Control


No differences in these respects were ob- 
tained for subjects whose trainers used the 
anxiety criterion, but some confirmation for 
our third hypothesis appeared in the results 
for the depression subjects, where, on the 
basis of previously mentioned data, we should 
expect to find support for it. Specifically, on 
the posttraining questionnaire, subjects in the 
Sus-Dep condition expressed significantly 
greater (p < .05) feelings of control in response 
to the question: “How much did you feel 
(the trainer’s) actions were dependent on 
what you did? That is, how much control did 
you feel your choices had on (the trainer’s) 
actions toward you?” On another question, 
“To what extent did you feel responsible for 
the kind of actions taken toward you?” 
the results were similar but failed to reach an 
adequate level of statistical significance 
(.10 < p < .20).

Several other items from the posttraining 
questionnaire give further insight into the 
different reactions to the two types of trainers. 
The following differences between the suspi- 
cious and trusting samples were significant 
at the .05 level: 

1. Subjects thought the trusting trainer was 
more generous than the suspicious one. 

2. Subjects working under the trusting 
trainer were more satisfied with their per- 
formance than were those subjects who had 
the suspicious trainer. 

3. Subjects thought the suspicious trainer 
was more concerned for them to adopt his 
criterion. 


4. The suspicious trainer was regarded as 
more critical than the trusting trainer (al- 
though this difference was significant only 
with the anxiety criterion). 

It is only fair to add that a number of 
other questions failed to reveal any consistent 
differences between the suspicious and trusting 
samples. These concerned how much the 
subject felt influenced by the trainer’s opinions, 
his reactions to being rewarded or punished, 
the perceived consistency of the trainer, and 
the extent to which the subject felt able to 
predict the trainer’s reactions. 


Differences between the Criteria 


In terms of both learning and subsequent 
feelings of control, the experimental hypotheses 
were less convincingly confirmed for the 
anxiety criterion than for the depression one. 
Two differences between the criteria are to be 
noted as possibly accounting for this result: 

1. The anxiety criterion is somewhat less 
plausible to our subjects. As can be seen in 
Table 2, the control group shows a 2:1 pref- 
erence for the depression items as indicators 
of psychopathology. Furthermore, of 10 sub- 
jects who at the end of the training indi- 
cated they had rejected the trainer’s criterion, 
7 were reacting to the anxiety criterion. 

2. The anxiety criterion seems somewhat 
easier for our subjects to verbalize. There are 
slightly more instances in the anxiety sample 
of correct descriptions of the trainer’s criterion 
and fewer instances of confusion between the 
two or describing the criterion by reference to 
the symptoms excluded from it. Furthermore, 
there is a hint in Table 2 that the anxiety 
criterion is learned more rapidly, although 
this difference may merely indicate a ceiling 
effect operating in the depression samples. 

These attributes of the anxiety criterion 
can reasonably be considered as acting to 
reduce the differences between the suspicious 
and trusting schedules. On the one hand, if 
the category is easy to conceptualize, subjects 
may learn it from the trusting trainer before 
they learn to hide their choices from him. 
This would make for good subsequent per- 
formance whether or not they eventually 
learn to hide. On the other hand, some of 
the subjects who quickly learn that the trainer 
is using an implausible criterion are likely to 
reject it and show poor subsequent perform- 
ance even though they know full well what his 
criterion is. Insofar as learning is quicker under
the suspicious schedule, these persons would 
sooner manifest their rejection of the criterion 
in this condition and, in terms of our present 
hypotheses, would be deviant cases. However, 
they would really be exemplifying the superior 
learning that the suspicious schedule makes 
possible. If we assume that a trainee has to 
have a fairly clear conception of the criterion 
before he can decide to accept or reject it, it 
would seem reasonable that the condition 
that leads him to show more will be superior 
not only in producing better learning of 
fairly acceptable trainer demands, but in 
hastening rejection of unacceptable ones. 

Several exceptional cases in the anxiety 
condition do, in fact, illustrate these effects, 
but they do not, of course, provide conclusive 
evidence as to the nature of the differential 
effects of the two criteria. 


DISCUSSION 


This experiment was designed to shed light 
on the situation where a trainee can act 
either to abet the trainer’s monitoring of his 
behavior or to defeat it. In everyday life 
continual surveillance by a trainer (parent, 
teacher, boss) of his trainee (child, student, 
employee) is usually not possible and probably 
not desirable. The trainee, then, is often faced 
with the same option that our subjects were 
confronted with: to help the trainer in his 
monitoring task or to make it difficult for him. 

In one condition, where a suspicious trainer 
was used, the subject had to show if he was
to avoid the shock; here the trainer always 
assumed a mistake had been made unless the 
trainee himself made sure that the former 
actually saw his choice. Once the trainee had 
learned to show, he had a good chance of 
discerning the trainer’s criterion and, thus, of 
assuring himself of a certain invariance in 
outcomes. He could learn the “matching 
rule” that the trainer was using and the 
latter’s fate control could become behavior 
control. By virtue of bringing order into the 
situation the trainee could feel a kind of 
security and could regard himself—to some 
extent at least—as responsible for the kind 
of outcomes he experienced. 

Contrast this condition with the other that 
made use of a trusting trainer. Here there is 
no compulsion to help the monitoring system 
for the trainer is an “old softie,” an indulgent,
benevolent soul who thinks the best of his
charges. Trainees are always innocent until 
proven guilty. Such a trainer retards the 
learning of his trainees by actually making it 
profitable for them to avoid “showing.” 
Since he will sometimes reward erroneous 
choices he will end up by confusing those he is 
trying to teach as to what exactly he is trying 
to teach. He may then give the impression of 
arbitrariness and capriciousness as well as of 
indulgence and lack of concern about the 
training. Accordingly, trainees may feel less 
control over what happens to them. 
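
The contrast between the two schedules can be put as a small decision rule. This is a minimal sketch under stated assumptions, not the experimental procedure itself: the function name, the "reward" and "punish" labels, and the `p_assume` parameter (the probability that the trainer follows his default assumption on an unseen trial) are hypothetical. The discussion above describes the default as always applied, which corresponds to `p_assume=1.0`.

```python
import random


def trainer_response(choice_correct, shown, schedule, p_assume=1.0, rng=random):
    """Return 'reward' or 'punish' for one trial.

    choice_correct: whether the trainee's choice matched the trainer's criterion.
    shown: whether the trainee brought the choice to the trainer's attention.
    schedule: 'suspicious' or 'trusting'.
    """
    if schedule not in ("suspicious", "trusting"):
        raise ValueError("schedule must be 'suspicious' or 'trusting'")
    if shown:
        # A shown choice is evaluated on its merits under either schedule.
        judged_correct = choice_correct
    else:
        # Unseen choice: the trusting trainer assumes it was correct,
        # the suspicious trainer assumes it was an error.
        default_judgment = (schedule == "trusting")
        if rng.random() < p_assume:
            judged_correct = default_judgment
        else:
            judged_correct = not default_judgment
    return "reward" if judged_correct else "punish"


print(trainer_response(False, False, "trusting"))    # hidden error goes unpunished
print(trainer_response(True, False, "suspicious"))   # hidden correct choice is punished
```

Under the trusting rule a hidden error is rewarded, so hiding pays and the feedback stops tracking the criterion. Under the suspicious rule only showing can avoid punishment, so the trainee's outcomes come to depend on his own choices.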

It is to be emphasized that we do not 
regard the trusting schedule as universally 
inferior to the suspicious one. In this experi- 
ment we have tried to create the conditions 
that work especially to the disadvantage of 
the trusting trainer. This disadvantage exists 
when he is unable to maintain complete 
surveillance over the trainee and when the 
latter is able to aid the monitoring by bringing 
his behavior to the trainer’s attention. Within 
this situation, we have focused on the depend- 
ent variables of showing/hiding and learning 
and have only partially explored other conse- 
quences of the two schedules such as their 
effects on attitudes. Some of these effects are 
more positive for the trusting trainer: he is 
regarded as more generous and subjects are 
more satisfied with their performance. In any 
long-term relationship, “morale” considera- 
tions such as these may prove of greater 
consequence than the negative effects we have 
emphasized. Therefore, we would certainly not 
use these results as a basis for advocating a 
general introduction of suspicion into inter- 
personal relations. The negative consequences 
of distrust that other studies reveal (e.g., 
Mellinger, 1956) also suggest caution on this 
point. On the other hand, we would not like to 
see totally disregarded the possible role of 
suspicion in introducing stability into the 
trainee’s world—suspicion as defined in this 
study and under the special conditions here 
represented. 

A further limitation on generalization from 
this study stems from the fact that we have 
examined only two rather extreme reactions 
to the absence of monitoring information. It 
remains a further problem to determine the 
effects of a more neutral approach in which 
the trainer makes no response or a noncom- 
mittal one. Although the analysis becomes too 
complex to go into here, we would expect this
schedule to have an effect similar to that of 
the trusting trainer if the trainee obtains 
considerable intrinsic gratification from the 
wrong response and the trainer imposes an 
appropriately heavy penalty for this response 
when he detects it. 

This experiment deals with the behavior of 
the trainee and his reactions and feelings 
consequent on the programed responses of 
his trainer. It is an important question, 
obviously, how real trainers react in situations 
of this sort. One might expect, for example, 
that trainers would adopt a “trusting” 
orientation toward quick learners, but a 
“suspicious” orientation toward dull trainees. 
If this were done too soon in the learning 
process, however, the learning of the bright 
trainees might be delayed and the learning 
curves of the two groups might then show 
an unexpected convergence. The difficulty of 
the learning task might have similar effects. 
These are problems for future research. 


SUMMARY 


This study compared two schedules used by 
a trainer when he was not able to monitor 
the behavior of his trainee. In a suspicious 
condition, unless the trainee presented him 
with evidence to the contrary the trainer 
usually acted on the assumption the trainee 
was in error. In the other, a trusting condition,
unless presented with contrary evidence the 
trainer usually assumed the trainee was 
correct. Each undergraduate subject, serving 
as “trainee,” made successive judgments as to
which of two types of symptoms indicated 
the greater degree of mental illness, whether 
an anxiety symptom or a symptom of de- 
pression. He also had the option of bringing 
each choice to the attention of the trainer, a 
“qualified” graduate student, who then 
gave an evaluation of the subject’s choice. 
For half the subjects, the trainer considered 
the anxiety choices as correct and for the 
other half, the depression choices. 

When the depression criterion was used the 
results were clearly in favor of the suspicious 
training schedule, as hypothesized. As com- 
pared to subjects working under the trusting 
trainer, subjects under the suspicious trainer 
were more willing to bring their choices to 
his attention, learned to a greater degree to 
use his criterion (subjects under the trusting 
trainer were not different in this respect from 
a control group), and expressed greater 
feelings of control over the trainer’s actions 
toward them. In the case of subjects to whom 
the trainer applied the anxiety criterion, the 
suspicious-trusting differences were less clear- 
cut although generally in the directions 
predicted. 
ROLE PLAYING VARIATIONS AND THEIR INFORMATIONAL
VALUE FOR PERSON PERCEPTION¹
EDWARD E. JONES, KEITH E. DAVIS, AND KENNETH J. GERGEN


Duke University 


LARGELY under the impetus of Heider’s
(1944, 1958) persistent concern with
phenomenological analysis, much of
the recent research in social perception has
addressed itself to the naive psychology of 
the individual perceiver. How do individuals 
use the behavior of others to infer the probable 
existence of more enduring personal charac- 
teristics? What are the bases for social evalu- 
ation that in turn color the impressions one 
forms of another? What information is ignored 
and what information is made central in the 
formation of an impression? A number of 
investigators have sought a partial answer to 
these questions by assuming that a basic 
feature of naive phenomenology is the assign- 
ment of observed behavior to psychological 
causes. It seems logical to propose, for example, 
that behavior whose locus of causation lies 
within the person is more relevant to inferences 
about his particular characteristics than 
behavior that is induced or constrained by 
external events. The present investigation was 
designed to demonstrate this proposition with 
specific reference to the adoption and performance
of social roles.

The concept of role has had a lively and 
controversial history in the literature of social 
science. It is often treated as a crucial bridging 
concept since it concerns the relations between 
social requirements and normative expecta- 
tions on the one hand, and individual percep- 
tions and behavior on the other. Controversy 
has surrounded the many attempts to define 
role, as Levinson (1959) notes, because these 
attempts have vacillated between viewing 
role as an aspect of social structure and 
viewing it as a description of socially relevant 
individual behavior. In the present paper, 
the concept of role refers to role demands 
rather than actual behavior. Role is herein 
treated as a set of expected behaviors implicit 
in the instructions to a stimulus person. These 


1 This study was carried out under support from the 
National Science Foundation, G8857. We are much in- 
debted to Barbara Chapman who served as experi- 
menter throughout the study. 
instructions define the impression the stimulus
person should attempt to create in presenting
himself to an interviewer, and variations in
behavior given this role definition represent 
the major independent variable. 

The present treatment of role is quite
consistent with any other treatment that
stresses the shaping of individual responses by
social expectations or externally imposed
norms. The point has often been made that
general adherence to relevant sets of social 
norms is very important in facilitating social 
interaction. Particularly in organizational 
contexts, but by no means exclusively there, 
many social interactions can be effectively 
described in terms of the interplay of appro- 
priate role behaviors. Jones and Thibaut 
(1958) have emphasized the economic significance
of such interactions between roles as
reducing the need for inferences about idiosyncratic
personal characteristics. The complement
of this point is that behavior appropriate
to role expectations has little informational 
value in highlighting these individual charac- 
teristics. 

To follow this line of reasoning a little 
further, roles facilitate interaction and the 
social cognitions that support it. The naive 
person has his own repertory of role constructs 
that help to anchor his perceptions of the 
social environment and to endow it with the 
necessary stability for planful action. On the 
other hand, the performance of social roles 
tends to mask information about individual 
characteristics because the person reveals 
only that he is responsive to normative 
requirements. If these requirements are un- 
clear or conflicting, of course, he may re- 
veal something about himself by the way in 
which he defines and displays appropriate 
behavior. The stronger and more unequivocal 
the role demands, however, the less informa- 
tion is provided by behavior appropriate to 
the role. Following our introductory comments, 
this conclusion may be derived from con- 
sidering probable differences in the attribution 
of phenomenal causality. When a person’s
behavior is very much in line with clear and
potent social expectations, we tend to treat it 
as externally caused and uninformative with 
regard to a wide range of personal charac- 
teristics. When it departs from normative 
expectations, on the other hand, we tend to 
locate the cause for the departure in motivational
forces peculiar to the person. We may
assume, of course, that he misperceived the 
expectations, but we would then wish to push 
on to determine the motivational sources of 
this perceptual distortion. 

From the perceiver’s point of view, the 
behavior of a stimulus person which departs 
from role expectations takes on special
significance for appraising the latter’s personal 
characteristics. In assessing the motives behind 
such a departure from role, the perceiver must 
view the sample of behavior available against 
the background of role specifications. In 
general, our inferences from behavior to 
personality must take into account the 
stimulus conditions eliciting the behavior. This 
is no less true when the stimulus conditions 
consist of clearly established role expectations. 

In attempting to predict the nature and 
direction of inferences, given a sample of 
behavior that departs from role expectations, 
a number of factors must be considered. For
one thing, there may be tendencies for the
perceiver either to minimize or maximize the 
nature of the departure. In organizing his 
impression of the stimulus person, the per- 
ceiver may assimilate the latter’s behavior 
sample to the role specifications governing 
the situation, thus avoiding the problem of
inferring unique characteristics. Alternatively, 
there may be a contrast effect in that the 
behavior sample becomes cognitively salient 
and is recalled as departing even more from 
the role than was actually the case. We are 
not as yet in a position to choose between
these alternative possibilities, or to specify the 
conditions favoring assimilation versus con- 
trast. The present study does provide a 
measure of memory distortion, however. 

Assuming that assimilation does not occur, 
or that it occurs incompletely, the perceiver’s 
inferences about unique characteristics rest 
on his attempt to understand why the departure
from role expectations took place.
Undoubtedly, some departures are intended to
achieve a humorous effect (the exchange
of “friendly insults” between collaborators
on a task); others are intended to play down 
role characteristics that might be offensive 
(the “soft-selling” salesman); still others
stem from motives of rebellion and non- 
conformity. In the typical case, however, 
departure from role suggests a pattern of 
motivation and skill that is at variance with 
specific role requirements. The individual 
does not play the role because, somehow, he 
cannot or will not. In such cases, personality 
seems to override role expectations or to 
color role performance in a unique and signifi- 
cant way. The most probable inference from 
role departures of this type is that the person 
reveals something of his “true self” through 
his failure to perform the expected role. 

The present investigation treats this last 
conjecture as a proposition. An experiment 
was designed in which stimulus persons were 
instructed in one of two patterns of role 
performance. The behavior of the stimulus 
person was arranged to be consistent with 
either the first or the second of these patterns, 
thus creating two experimental treatments
where the person’s behavior was “in role” 
and two treatments where it was clearly 
“out of role.” The general hypotheses prompt- 
ing the study were: 

1. Persons performing in line with role 
expectations reveal little of value for assessing 
their personal characteristics. When asked to 
describe such a person, subjects do so with 
little confidence and tend to avoid extreme 
statements. 

2. Persons whose performance departs from 
role expectations reveal their personal charac- 
teristics through the direction and form of 
this departure. Their behavior is judged as 
internally caused and forms the basis for 
direct inferences about personal characteristics, 
characteristics that may be judged with 
confidence. 

3. Roles do, however, serve an organizing
function in person perception. Because of its 
predictive value, in-role behavior is more 
accurately recalled than behavior that departs 
from role expectations. 

Note that no specific hypothesis was 
formulated concerning the possibilities of 
assimilation and contrast, but data bearing 
on these possibilities are to be presented. 
METHOD
Subjects 


One hundred and thirty-four male undergraduates 
participated as subjects in groups ranging in size from 
5 to 20. Since the experimental design consisted of four 
treatment variations, an attempt was made to assign 
approximately equal numbers of subjects to each condi- 
tion. The actual cell frequencies varied from 31 to 37. 
As one attempt to control for individual differences in 
orientation to others, care was also taken to compose 
the treatment groups of approximately equal numbers 
of high, middle, and low scorers on Christie’s Mach IV 
Scale. This scale was originally designed to measure 
differences in the tendency to endorse Machiavellian 
sentiments. A high score reflects a tough minded, cyni- 
cal, and somewhat opportunistic attitude toward 
others; low scorers are more inclined to value affective 
involvement with others and to feel that social rela- 
tions should be governed by strict ethical norms. A 
description of item content and some sampling com- 
parison data may be found in Christie and Merton 
(1958). Studies by Jones and Daugherty (1959) and 
Jones, Gergen, and Davis (in press) provide data on the 
experimental validity of the scale. Though the present 
study was not designed to validate the Mach IV Scale, 
and the scale served mainly as a control variable, the 
Mach IV score was included as a potential source of 
variation in the analysis of data. It was felt that the 
highs and the lows might respond differently to the 
experimental conditions, though no specific hypotheses 
involving Machiavellianism were formulated. 


Procedure Overview 


Each experimental session began with a brief intro- 
duction in which the experimenter described the study 
as a problem solving task involving judgments of 
another person. Subjects were instructed to listen care- 
fully to a tape recorded interview between a psycholo- 
gist and a student, in which the student would be in- 
structed to play a particular role. The tape recording 
began with the psychologist giving explicit instructions 
to the stimulus person (SP) about the interview to 
follow. Although the recordings were actually based on 
carefully constructed scripts, an attempt was made to 
convince the subjects that the SP was given no in- 
formation about the interview and his role that did not 
appear on the tape. Thus, the taped instructions em- 
phasized that the SP was to present himself in the in- 
terview in such a way as to impress the interviewer 
that he was ideally suited for a particular job. In interviews
played to different groups of subjects one of two
jobs requiring radically different personal qualifications 
was described. In this way the content of the role was 
manipulated. The SP was told in the recording to “be as 
honest as you can unless you think another answer 
would help your chances better of getting the job.” In 
the interview that followed, SP answered some standard 
questions about his background in a neutral fashion and 
then responded to a series of choice items some of which 
were clearly relevant to the job for which he was apply- 
ing. In responding to these items, the SP either gave 
answers appropriate to the job description he had been 
given, or answers which revealed markedly different 
preferences. Thus, the design involved presentation of
four different stimulus patterns, two of which were “in-role”
and two of which were “out-of-role.”

After listening to the interview, each subject was 
asked to state his general impression of the SP and to 
fill out in succession the following dependent vari- 
able forms: given the same choice items to which the 
SP responded in the interview, subjects were asked to 
reconstruct from memory SP’s response to each item; 
after this choice test form was collected the subject was 
handed an identical form and instructed to indicate 
how the SP would have responded if he were being 
completely honest in describing himself; finally, each 
subject was given a 16-item trait-rating scale and in- 
structed to rate the SP, and indicate his subjective con- 
fidence for each rating. The stimulus materials and de- 
pendent variable forms are more fully described below. 


Stimulus Variations 


As implied by the foregoing discussion, the experi- 
mental design called for the construction of four sepa- 
rate tapes to be played as the stimulus pattern for in- 
dependent groups of subjects. These tapes varied along 
two cross-cutting dimensions: on half of the tapes the 
job described was that of a submariner, on the remaining
half the job was that of an “astronaut” in training
for space flights; on half of the tapes the job description 
was followed by a set of responses appropriate for the 
submariner job (other-directed pattern), on the re- 
maining half the responses were more appropriate for 
the astronaut job (inner-directed pattern). The four 
tapes were actually constructed by separately recording 
these four segments, always with the same person 
reading the part of the SP, and splicing them into the 
following combinations: Submariner-Other, Submari- 
ner-Inner, Astronaut-Inner, and Astronaut-Other. 
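
The crossing of role description and behavior sample described above can be sketched in a few lines of Python. The names `ROLE_PATTERN` and `build_design` are illustrative assumptions and do not come from the study; the only facts taken from the text are the two roles, the two response patterns, and which pattern is appropriate to which role.

```python
from itertools import product

# Response pattern that each job description calls for (from the text):
# submariner -> other-directed, astronaut -> inner-directed.
ROLE_PATTERN = {"Submariner": "Other", "Astronaut": "Inner"}


def build_design():
    """Return the four tape conditions, each labeled in-role or out-of-role."""
    cells = {}
    for role, pattern in product(ROLE_PATTERN, ("Other", "Inner")):
        # A tape is in-role when the behavior sample matches the role's pattern.
        label = "in-role" if ROLE_PATTERN[role] == pattern else "out-of-role"
        cells[f"{role}-{pattern}"] = label
    return cells


if __name__ == "__main__":
    for tape, label in build_design().items():
        print(f"{tape:18s} {label}")
```

Under these assumptions, Submariner-Other and Astronaut-Inner come out as the in-role tapes, and Submariner-Inner and Astronaut-Other as the out-of-role tapes.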

Role descriptions. Presentation of the submariner’s 
role was prefaced by a reference to the capacities of 
atomic submarines and the corresponding qualities 
necessary for adjusting to the social conditions of sub- 
marine life. The following excerpts give both the flavor 
and some of the content of the role description: 

People at the submarine school are pretty sure they 
know quite a bit about the kind of person who adapts 
well to submarine life. . . . The main thing they look 
for is stability and good citizenship . . . constant co- 
operation with others is essential . . . willingness to 
tolerate routine . . . not supposed to think for him- 
self . . . sticks to the rules. . . . Since submariners are 
in such constant contact with each other, it’s im- 
portant, of course, that the good submariner enjoys 
other people around, that he be relaxed and friendly 
and slow to irritate. 

Presentation of the astronaut’s role capitalized on 
the timely issue of sending a single man into space. The 
role description is suggested again by the following ex- 
cerpts: 

One of the most difficult requirements of space travel, 
at least in its early stages, is that it will most likely 
involve a man’s being isolated from virtually all hu- 
man contact for long periods of time . . . looking for 
men who don’t need to have other people around . . . 
inner resources and the ability to maintain concen- 
tration without stimulation from others . . . alert, 
imaginative, resourceful. . . . 
These particular role descriptions were designed, of 
course, without regard to truth value, solely to em- 
phasize plausibly two sets of qualities that might best be 
described as other- versus inner-directed. 

Behavior samples. SPs on each of the four tapes re- 
sponded in an identical fashion until they were asked 
by the interviewer to take a “choice test which has 
recently been devised to indicate how well a person will 
fit into various niches in life. . . .” The test that followed 
consisted of 22 items, each comprising a pair of state- 
ments. The SP was instructed to choose the member of 
each pair which was more characteristic of himself and 
to indicate his certainty on an 11-point scale. Half of 
the items were buffer items not specifically relevant to 
the difference between roles and always answered in the 
same manner by the SPs. The remaining 11 items were 
“critical” in reflecting the intended difference between 
inner- and other-directed response patterns. The fol- 
lowing examples indicate some of the pair members en- 
dorsed by the SPs in the two behavior samples: 


Item   Other-Directed                                    Inner-Directed 
 3.    I always like to support the majority.            I like to feel free to do what I want to do. 
 9.    I would like to be a door-to-door salesman.       I would like to be a forest ranger. 
17.    When planning something I always seek             When planning something I like to work 
       suggestions from others.                          on my own. 
21.    I like to know how other people think             I avoid situations where I am expected to 
       I should behave.                                  behave in a conventional way. 
22.    I like to settle arguments and disputes           I like to attack points of view that are 
       of others.                                        contrary to my own. 



For each of these item pairs, the SP orally endorsed 
the statement appropriate to the condition and indi- 
cated the degree of his certainty. Degree of certainty 
was predetermined so that the same scale positions were 
endorsed the same number of times for each behavior 
sample. This was to equalize any tendency to regress 
toward or away from less certainty in the memory task. 


Impression Rating Scale 


The final task of the subjects was to record their 
impression of the SP in terms of a 16-item rating scale. 
Each item consisted of two polar adjectives separated 
by 10 scale points. To the right of each item was a 5- 
point confidence scale for that item. Thus, subjects 
were to indicate what they thought the SP was “really 
like” and how confident they were of each rating. Ten of 
the items were chosen to reflect five distinct clusters 
whose contents were relevant to the hypotheses being 
tested. Each cluster consisted of two items, each sug- 
gesting an aspect of cluster meaning but balanced for 
direction to inhibit tendencies toward response set. 
Thus the conformity cluster consisted of the pairs 
“conforming-independent” and “creative-unoriginal.” 
Other clusters were affiliation, intelligence, motivation, 
and candor. The remaining six items were related to 
each other only in their strong evaluative tone: warm- 
cold, popular-unpopular, likable-irritating, etc. 


TABLE 1 
PREDICTIONS OF SP’S TRUE RESPONSES 
(Summary of Analysis of Variance* and t test Results) 

Source             df    MS          F 
A. Astro-Sub        1    1,554.73    63.56** 
B. Inner-Other      1    1,764.91    64.27** 
C. Mach             2       18.81 
A × B               1        8.63 
A × C               2       62.67 
B × C               2       39.23 
A × B × C           2       22.84 
Error             122       27.46 
Total             133 

Group    N     Mean              t 
AO       33    91.12 
SO       31    69.26             AO vs. SO: [illegible] 
AI       33    69.[illegible] 
SI       37    43.92             AI vs. SI: [illegible] 

* Using the approximation technique for unequal cell 
frequencies (Snedecor, 1946). 
** p < .001. 


In analyzing the results, cluster scores were derived 
simply by adding the scale placements for the two com- 
ponent items, naturally reversing the score of one item. 
Confidence scores were similarly derived by simple 
addition of scale values. 
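
The scoring rule just described can be sketched as follows. This is a minimal illustration, not the authors' procedure in detail: the function names are hypothetical, and the assumption that each item is scored from 0 to 10 is chosen only because it makes the cluster midpoint 10.0, the value the Results section gives for a cluster. The article does not state the numerical coding of the scale.

```python
def cluster_score(item_a, item_b, low=0, high=10):
    """Sum two item placements after reversing the oppositely keyed item.

    item_a is keyed in the cluster direction; item_b is keyed the other way,
    so it is reflected around the scale midpoint before the two are added.
    The 0-10 coding is an assumption, not a detail given in the article.
    """
    reversed_b = low + high - item_b
    return item_a + reversed_b


def cluster_confidence(conf_a, conf_b):
    """Confidence for a cluster is the simple sum of the two item confidences."""
    return conf_a + conf_b


if __name__ == "__main__":
    # Hypothetical ratings: 8 on the first item, 3 on the reverse-keyed item.
    print(cluster_score(8, 3))       # 15
    print(cluster_score(5, 5))       # 10 (scale midpoint under the 0-10 assumption)
    print(cluster_confidence(4, 3))  # 7
```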


RESULTS 


Predicted Differences in Perception 


In line with the general hypotheses of the 
study, the first specific prediction was that 
SPs performing out-of-role (Astro-Others and 
Sub-Inners) would be perceived as revealing 
their true preferences in the simulated inter- 
view more than SPs performing in-role 
(Astro-Inners and Sub-Others). The most 
direct test of this hypothesis may be found 
in the data provided by the subjects in trying 
to indicate how the SP would have responded 
in the interview if he was being completely 
faithful to himself. These data are summarized 
in Figure 1a and in Table 1. It is clear that 
the prediction is confirmed. The Astro-Other 
and the Sub-Inner SPs are both seen as 
revealing their true preferences, though 
there is a slight and understandable regression 
toward the mean. Interestingly enough, and 
quite in line with predictions, the two in-role 
groups locate the true answers of the SPs 
almost exactly half-way between the other- 
directed and the inner-directed pattern. 
Since behavior appropriate to powerful role 
requirements generally masks the charac- 
teristics of the actor, the perceivers apparently 
feel that their best guess is a completely 
neutral one. 

Fig. 1. Degree of perceived and recalled other-directedness. (a) Predictions of “true answer patterns” (means). (b) Recall of answers actually given (means). 

The next issue to be raised is the extent to 
which these results are restricted to the 
particular pattern of items covered in the 
recorded interview. That is, do the highly 
significant results summarized in Table 1 
merely reflect rational manipulations of the 
specific response scale used by the SP, or do 
they generalize to related measures of per- 
ception and inference? Recall that the subjects 
also filled out a bipolar adjective rating 
scale, in an attempt to express their appraisal 
of the SP’s true characteristics. The items 
of this scale were members of small clusters 
of related traits, varying in their relevance to 
the dimension of inner-other directedness. 
Two of the most relevant clusters, by a priori 
judgment, were attempts to measure percep- 
tions of affiliation and conformity. It was 
predicted that the two out-of-role SPs would 
be perceived to differ markedly in both 
affiliation and conformity and that the two 
in-role SPs would not. The results, as sum- 
marized in Table 2, clearly confirm this 
prediction. The Astro-Other is seen as signifi- 
cantly more affiliative and conforming than 
the Sub-Inner, and each is seen as differing 
significantly from its in-role control. Analyses 


of variance indicate that Machiavellianism 
contributes to no significant effects and there 
are no significant interactions between role 
and behavior sample. This result seems to 
indicate that genuine perceptual decisions 
have been made that involve assessing the 
meaning of the behavior sample provided by 
the SP against the background of the role 
playing instructions imposed on him. 
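
The individual comparisons in Table 2 can be approximately checked from the tabled means, standard deviations, and group sizes. The sketch below assumes a t ratio for independent groups with separate (unpooled) variance estimates; the article does not say which formula was used, so this is an assumption, and the function name is illustrative. The input values are those read from Table 2.

```python
from math import sqrt


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """t for two independent groups from summary statistics (unpooled variances)."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Affiliation, Astro-Other vs. Sub-Other (values as read from Table 2).
t_affiliation = t_from_summary(15.27, 2.92, 33, 12.00, 3.53, 31)

# Conformity, Astro-Inner vs. Sub-Inner (values as read from Table 2).
t_conformity = t_from_summary(13.09, 3.42, 33, 9.41, 4.95, 37)

if __name__ == "__main__":
    print(round(t_affiliation, 2))  # 4.02
    print(round(t_conformity, 2))   # 3.65
```

Both values agree with the t ratios tabled for these two comparisons, which suggests that the tabled figures are at least consistent with this formula.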

Since the predictions involving the direction 
and magnitude of perceptual rating differences 
have been borne out, the next relevant question 
concerns the confidence with which perceptual 
judgments were made. The general hypothesis, 
it will be recalled, predicted that in-role 
ratings would be less confidently made than 
ratings of out-of-role behavior. This prediction 
could be easily tested since the subject rendered 
a judgment of confidence with respect to each 
trait rated. The most precise test of the 
general prediction involves confidence ratings 
based on the two clusters most relevant to 
the differences between roles: affiliation and 
conformity. When confidence ratings on these 
clusters are summed for each individual, the 
TABLE 2 
PERCEPTIONS OF AFFILIATION AND CONFORMITY 

                Astro-    Astro-    Sub-      Sub-       Comparisons 
                Other     Inner     Other     Inner      Direction    t 
N               33        33        31        37 
Affiliation 
  Mean          15.27     11.12     12.00      8.64      AO > SO      4.02** 
  SD             2.92      3.81      3.53      4.73      AI > SI      2.12* 
Conformity 
  Mean          15.91     13.09     12.58      9.41      AO > SO      4.02** 
  SD             3.22      3.42      3.39      4.95      AI > SI      3.65** 

Note. The higher the mean value, the greater the perceived 
affiliation or conformity. Comparisons between AO and SI are not 
tabled, but the differences between these conditions would of 
course be highly significant. 
* p < .05. 
** p < .001. 

TABLE 3 
CONFIDENCE RATINGS BY TREATMENTS 
(Analysis of Variance(a) Summary) 

                          Affiliation and Conformity 
Source             df     MS         F 
A. Astro-Sub        1      .07 
B. Inner-Other      1     1.62       3.29 
C. Mach             2      .26 
A × B               1     8.58(b)   17.42** 
A × C               2      .33 
B × C               2      .84 
A × B × C           2      .47 
Within            122      .49 
Total             133 

(a) Using the approximation technique for unequal cell 
frequencies (Snedecor, 1946). 
(b) Results of tests of individual mean comparisons: AO > SO, 
t = 3.30, p < .01; SI > AI, t = 2.55, p < .02. 
** p < .001. 


resulting pattern clearly confirms the hypoth- 
esis. As Table 3 shows, the interaction between 
role and behavior sample is highly significant. 
As for the individual mean comparison, in 
each case the subjects feel more confident 
about rating the SP who is behaving out of 
role than the SP who is behaving the same 
way in role. 
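
For readers checking the analysis of variance summaries, each F is the effect mean square divided by the error (within) mean square. The sketch below is only an arithmetic illustration; the function name is an assumption, and the inputs are the rounded mean squares printed in Tables 1 and 3, so small discrepancies from the tabled F values are to be expected.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio: effect mean square over error mean square."""
    return ms_effect / ms_error


if __name__ == "__main__":
    # Table 1, Inner-Other main effect: MS = 1,764.91, error MS = 27.46.
    print(round(f_ratio(1764.91, 27.46), 2))  # 64.27
    # Table 3, A x B interaction: MS = 8.58, within MS = .49.
    print(round(f_ratio(8.58, 0.49), 2))      # 17.51
```

The first value reproduces the tabled F exactly. The second differs slightly from the tabled 17.42, which is what one would expect if the within mean square was rounded to two decimals for printing.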


Differences in Recall 


As one of their tasks, all subjects were 
asked to reproduce the responses made by the 
SP in the simulated interview. The fidelity of 
these attempts at recall was treated in two 
ways. First, a recall score was computed for 
each subject involving the degree of discrep- 
ancy in scale points for each item, summed 
without regard to direction. For convenience 
we may call these absolute error scores. 
Subjects were expected to make fewer errors 
in recalling the behavior samples of the in-role 
treatments than those in the out-of-role 
treatments. The assumption was that roles, 
like all categories that summarize relevant 
information, facilitate cognitive organization 
and enhance one’s ability to predict behavior 
that is appropriate to the role. In the present 
case, the behavior sample available in the 
in-role treatments tended to confirm the 
expectations established by the role instruc- 
tions. For subjects in the out-of-role treat- 
ments, the role instructions could not be used 
to organize and predict the behavior which 
occurred, except insofar as the subjects were 
led to adopt a clear negative expectation that 
the behavior was the opposite of that called 
for. 

The results on this measure of absolute 
recall confirm the hypothesis. The responses 
of the in-role SPs (Astro-Inner and Sub-Other) 
are recalled with greater accuracy than the 
responses of the out-of-role SPs (Astro-Other 
and Sub-Inner). As Table 4 shows, the pre- 
dicted interaction is significant (p < .05). Most 
of this interaction effect comes from the two 
cells in which the SP is other-directed (Sub- 
Other > Astro-Other, t = 2.60, p < .01; 
Astro-Inner > Sub-Inner, t = .64, p < .50). 
There is no obvious reason for this difference 
unless the submariner role was more helpfully 
predictive in organizing information about 
the other-directed behavior sample than was 
the astronaut role in organizing information 
about the inner-directed sample. 

The recall data were also scored to take 
account of the direction of deviation from 
accuracy. These data are relevant for answer- 
ing any questions dealing with the assimilation 
of recalled responses to categories implied by 
the roles. In fact, as Figure 1b shows, there was 
no evidence of either assimilation or contrast 
in directional recall of the SP’s interview 
preferences. Those errors which the subjects 
did make (see the foregoing data on absolute 
errors) were quite evenly distributed on either 
side of the true scale position. As far as the 
mean directed error scores are concerned, then, 
the group recall accuracy was extremely high. 


TABLE 4 
ABSOLUTE RECALL ERRORS BY TREATMENTS 
(Analysis of Variance(a) Summary) 

Source             df     MS        F 
A. Astro-Sub        1     3.89 
B. Inner-Other      1     1.36 
C. Mach             2      .30 
A × B(b)            1     9.59      6.23* 
A × C               2     3.61 
B × C               2      .13 
A × B × C           2     1.40 
Within            122     1.54 
Total             133 

(a) Using the approximation technique for unequal cell 
frequencies (Snedecor, 1946). 
(b) The means for the four cells represented by this interaction 
were: X (Astro-Other) = 12.42; X (Sub-Other) = 9.52; X (Astro- 
Inner) = 10.00; X (Sub-Inner) = 10.56. 
* p < .05. 


Of course, we cannot state on the basis of 
these results that distortions from accurate 
recall of social behavior are always random 
distortions, but in the present case there is no 
evidence for directional errors either toward 
or away from the role category implied by the 
instructions to the SP. 
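
The two ways of scoring recall, absolute and directional, can be illustrated with a short sketch. The function name, the sign convention (recalled minus actual), and the example scale positions are all assumptions made for illustration; none of the numbers below are data from the study.

```python
def recall_errors(actual, recalled):
    """Return (absolute_error, directed_error) summed over items.

    absolute_error ignores the direction of each discrepancy;
    directed_error keeps the sign, so opposite errors cancel.
    """
    diffs = [r - a for a, r in zip(actual, recalled)]
    absolute_error = sum(abs(d) for d in diffs)
    directed_error = sum(diffs)
    return absolute_error, directed_error


if __name__ == "__main__":
    # Hypothetical certainty-scale positions for four items.
    actual = [9, 3, 7, 5]
    recalled = [8, 4, 8, 4]
    print(recall_errors(actual, recalled))  # (4, 0)
```

In this hypothetical case the subject misses every item by one scale point, yet the directed score is zero because the errors fall evenly on both sides of the true positions. This is the pattern the text describes: measurable absolute error with no systematic drift toward or away from the role category.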


Perception of Additional Characteristics 

In planning the experiment, it seemed at 
least conceivable that many subjects would 
attribute the SP’s out-of-role performance in 
the Astro-Other and Sub-Inner conditions 
to poor motivation, lack of intelligence, or 
both. If such were the case, then it would 
not necessarily follow that the behavior 
sample provided would be taken as a “true” 
reflection of the SP’s affiliative and conforming 
tendencies (or the lack thereof). We have seen 
that most subjects did consider the out-of-role 
performance to be reflections of these ten- 
dencies, but it is still of interest to note their 
ratings of the SP on trait clusters tapping 
perceived motivation and intelligence. As 
Table 5 shows, the Sub-Other SP was seen as 
more highly motivated and intelligent than 
the Astro-Other SP (p < .01), but this 
expected trend was slightly reversed in the 
case of the two inner-directed SPs. It would 
appear, then, that the in-role SP is judged to 
have greater motivation and intelligence 
only when the role involves volunteering 
for the submarine service. There is no obvious 
TABLE 5 
SELECTED TRAIT AND CLUSTER MEANS BY TREATMENTS 
WITH APPROPRIATE COMPARISONS 

                Astro-    Sub-      Astro-    Sub-       Relevant 
Clusters        Other     Other     Inner     Inner      Comparisons 
n               33        31        33        37 
Motivation 
  Mean           9.97     12.10      9.64     10.92      SO > AO, p < .01 
  SD             3.47      3.66      4.48      4.09 
Intelligence 
  Mean           8.00     11.13      9.39      9.84      SO > AO, p < .01 
  SD             3.35      4.15      3.72      3.94 
Candor 
  Mean          12.42      9.68     10.09     12.08      AO > SO, p < .01; 
  SD             4.09      3.61      3.68      3.77      SI > AI, p < .05 



reason for this difference in response to the 
two roles. Perhaps the inner-directed pattern 
seemed more artificial or obvious in the context 
of astronaut instructions, or perhaps the 
meaning of motivation and intelligence 
became less situation-bound when subjects 
were asked to appraise a truly inner-directed 
man. 

Also of interest are the ratings of perceived 
candor. It might be expected that performing 
out-of-role would be construed as evidence of 
the SP’s frankness and sincerity. The results 
in Table 5 do show a clear difference between 
in-role and out-of-role SPs with regard to the 
perception of candor. As expected, the in-role 
SPs are judged by the average subject to fall 
at or near the midpoint of the scale (10.0 
for the cluster); the out-of-role SPs are seen 
as significantly more candid in each variation 
of role instructions. 

Results on the remaining traits add little 
to the picture already presented. For the 
most part the means fall into a pattern similar 
to those reflecting perceived affiliation and 
conformity. That is, the two in-role SPs are 
perceived to be relatively neutral on most of 
the evaluative trait dimensions; the Astro- 
Others are seen as significantly more likable 
(versus irritating), warm (versus cold), 
popular (versus unpopular), and helpful 
(versus disinterested) than the Sub-Inners. 
This pattern of findings might suggest the 
operation of a strong “halo” or “generosity” 
effect favoring those who are perceived as 
other directed. However, it must be recalled 
that the Astro-Other SP is seen as less highly 
motivated and less intelligent than the 
Sub-Other SP, personal attributes that are 
usually considered to be quite evaluative. 
Also, Astro-Other and Sub-Inner SPs are 
both seen as more candid than their in-role 
controls. It seems more likely, then, that 
evaluative traits like warm, popular, likable, 
and helpful are linked to other-directedness 
more by direct association than through an 
underlying decision that the Astro-Other 
SP is good and the Sub-Inner SP is bad. This 
conclusion is supported by two further results: 
to an extent that is nearly significant, the 
two out-of-role SPs are seen as more interesting 
(versus boring) than the two in-role SPs; 
also, the Astro-Inner SP is seen as significantly 
more conceited (versus self-effacing) than 
either the Astro-Other or the Sub-Inner SP. 
While these incidental findings are not all 
easy to rationalize, they do point up the com- 
plexity and subtlety of the cognitive impres- 
sions created by the experimental combinations 
of role and behavior sample. 


Machiavellianism 


While the study was designed without any 
consideration of the possible role of Machiavel- 
lianism as measured by Christie’s Mach IV 
Scale, it did seem possible that high scorers on 
the Mach Scale might be generally more 
sensitive to variations in role instructions and 
more inclined toward negative evaluation of 
the SPs who were unable or unwilling to play 
the prescribed role. 

In all of the major analyses, Mach level was 
included routinely as a potential source of 
variance. In no case did it contribute to 
significant main or interactive effects. Either 
the situational manipulations were simply too 
powerful for individual differences on this 
dimension to manifest themselves, or our 
knowledge of Mach Scale correlates is insuffi- 
cient to make meaningful predictive state- 
ments. In view of the results of a recently 
completed study by Jones, Gergen, and Davis 
(in press), the latter alternative seems quite 
tenable. With respect to one meaningful 
comparison, however, Machiavellianism does 
make a significant contribution. The high 
scorers tend to attribute greater intelligence 
to the in-role than the out-of-role SPs whereas 
the low scorers show the reverse pattern in 
perceiving intelligence. This difference between 
highs and lows is significant (t = 2.15, p < 
.05). A similar trend was noted with regard 
to perceived motivation, but this fell short of 
significance (t = 1.62, p < .15). Though the 
differences in perceived intelligence are 
consistent with what little knowledge we do 
have about high and low Mach scorers, there 
is little else in the data to suggest that the 
variable is relevant to the perception of in-role 
and out-of-role SPs. 


DISCUSSION 


The results of the present study are un- 
equivocal. Starting from the assumption that 
individual characteristics are obscured when a 
person is exposed to strong and demanding 
stimulus forces, we have reasoned that social 
stimuli embraced by the role concept may 
operate in this same way. Thus a person who 
conforms to salient social expectations reveals 
little about his basic and distinguishing char- 
acteristics. On the other hand, one who rejects 
or ignores pressures to play a defined role is 
considered to reflect his true disposition and 
is perceived with confidence. 

It is undoubtedly true that this reasoning 
is ubiquitous in the psychologist’s approach 
to personality assessment. In many programs 
of assessment, the patient or subject is exposed 
to a variety of situational pressures and task 
demands. Of his responses to these situations, 
his nonmodal reactions are clearly more 
informative and carry the most interpretive 
weight. To take another example, psychological 
screening for desirable jobs must cope with 
the problems raised by this study in order to 
be effective. Since the role constraints for the 
applicant are often obvious, the interviewer or 
employer must penetrate to more subtle 
cues or fall back on projective devices which, 
though unreliable at best, at least produce the 
response variety essential for individualized 
judgment. 

It is probably true that short-term social 
interactions can perfectly well proceed in line 
with established expectations defining re- 
ciprocal roles. If the interaction is self-limiting 
(as, say, between a hotel guest and his bellhop) 
there is little need for personalized information 
to sustain the interaction. When a relationship 
is more permanent, however, and involves 
the prospect of interactions in varied situ- 
ations, such information rapidly increases in 
value. It is to the perceiver’s advantage, in 
such cases, to be especially attuned to out-of- 
role behaviors and to create situations in 
which out-of-role behaviors can be most 
clearly observed. The results of the present 
study show that such behaviors, rightly or 
wrongly, are perceived to be peculiarly 
diagnostic of individual characteristics. Judg- 
ments about these personal qualities are 
presumably most important in governing 
the perceiver’s behavior as he ventures into 
new situations with the SP. 


SUMMARY 


An experiment was designed to test the 
general proposition that behavior which is 
appropriate to a clearly specified role is 
relatively uninformative about personal char- 
acteristics. Subjects were asked to listen to a 
recorded interview in which the interviewee 
was heard being instructed, in two treatment 
conditions, to respond “as if” he very much 
desired to be accepted in the submarine 
service, and in two treatments as if he wished 
to qualify as a space astronaut. The qualifica- 
tions of these two positions were described 
in such a way that dramatically different 
personal qualifications were required. As the 
interview proceeded, the interviewee either 
responded in line with the qualifications 
described for the astronaut position (inner- 
directedness) or with those for the submariner 
position (other-directedness). Thus, in a 
four-cell design there were two cells of in-role 
behavior to be judged (Astronaut-Inner and 
Submariner-Other) and two cells of out-of-role 
behavior (Astronaut-Other and Submariner- 
Inner). Reasoning from the importance of 
distinguishing between self-caused and exter- 
nally induced behavior in person perception, 
the following predictions were made: 

1. Out-of-role SPs are perceived to be 
revealing their true characteristics more than 
in-role SPs. 

2. Out-of-role SPs are rated with greater 
confidence on the dimensions relevant to 
role performance. 

3. The performance of in-role SPs is more 
accurately recalled than that of out-of-role 
SPs. 

Each of these predictions was strongly 
confirmed by the results. 


REFERENCES 


Christie, R., & Merton, R. K. Procedures for the 
sociological study of the values climate of medical 
schools. J. med. Educ., 1958, 33, 125-153. 

Heider, F. Social perception and phenomenal causality. 
Psychol. Rev., 1944, 51, 358-374. 

Heider, F. The psychology of interpersonal relations. 
New York: Wiley, 1958. 

Jones, E. E., & Daugherty, B. N. Political orienta- 
tion and the perceptual effects of an anticipated 
interaction. J. abnorm. soc. Psychol., 1959, 59, 
340-349. 

Jones, E. E., Gergen, K. J., & Davis, K. E. Some de- 
terminants of reactions to being approved or dis- 
approved as a person. Psychol. Monogr., in press. 

Jones, E. E., & Thibaut, J. W. Interaction goals as 
bases of inference in interpersonal perception. In 
R. Tagiuri & L. Petrullo (Eds.), Person perception 
and interpersonal behavior. Stanford: Stanford 
Univer. Press, 1958. Pp. 151-179. 

Levinson, D. J. Role, personality, and social structure 
in the organizational setting. J. abnorm. soc. 
Psychol., 1959, 58, 170-180. 

Snedecor, G. W. Statistical methods. (4th ed.) Ames, 
Iowa: Collegiate Press, 1946. 


(Received September 23, 1960) 





















Journal of Abnormal and Social Psychology 
1961, Vol. 63, No. 2, 311-318 


IDENTIFICATION AS A PROCESS OF INCIDENTAL LEARNING¹ 


ALBERT BANDURA AND ALETHA C. HUSTON 


Stanford University 


ALTHOUGH part of a child’s socialization 
takes place through direct training, 
much of a child’s behavior repertoire 

is believed to be acquired through identifica- 
tion with the important adults in his life. This 
process, variously described in behavior 
theory as “vicarious” learning (Logan, Olm- 
sted, Rosner, Schwartz, & Stevens, 1955), 
observational learning (Maccoby & Wilson, 
1957; Warden, Fjeld, & Koch, 1940), and 
role taking (Maccoby, 1959; Sears, Maccoby, 
& Levin, 1957) appears to be more a result 
of active imitation by the child of attitudes 
and patterns of behavior that the parents 
have never directly attempted to teach than 
of direct reward and punishment of instru- 
mental responses. 

While elaborate developmental theories 
have been proposed to explain this phenome- 
non, the process subsumed under the term 
“identification” may be accounted for in 
terms of incidental learning, that is, learning 
that apparently takes place in the absence of 
an induced set or intent to learn the specific 
behaviors or activities in question (McGeoch 
& Irion, 1952). 

During the parents’ social training of a child, 
the range of cues employed by a child is likely 
to include both those that the parents consider 
immediately relevant and other cues of 
parental behavior which the child has had 
ample opportunities to observe and to learn 
even though he has not been instructed to do 
so. Thus, for example, when a parent punishes 
a child physically for having aggressed toward 
peers, the intended outcome of the training is 
that the child should refrain from hitting 
others. Concurrent with the intentional learn- 
ing, however, a certain amount of incidental 
learning may be expected to occur through 
imitation, since the child is provided, in the 
form of the parent’s behavior, with an ex- 
ample of how to aggress toward others, and 


¹This investigation was supported in part by Re- 
search Grant M-1734 from the National Institute of 
Health, United States Public Health Service, and the 
Lewis S. Haas Child Development Research Fund, 
Stanford University. 
this incidental learning may guide the child’s 
behavior in later social interactions. 

The use of incidental cues by both human 
and animal subjects while performing nonimi- 
tative learning tasks is well documented by re- 
search (Easterbrook, 1959). In addition, 
studies of imitation and learning of incidental 
cues by Church (1957) and Wilson (1958) have 
demonstrated that subjects learn certain inci- 
dental environmental cues while imitating 
the discrimination behavior of a model and 
that the incidental learning guides the sub- 
jects’ discrimination behavior in the absence 
of the model. The purpose of the experiment 
reported in this paper is to demonstrate that 
subjects imitate not only discrimination re- 
sponses but also other behaviors performed by 
the model. 

The incidental learning paradigm was em- 
ployed in the present study with an important 
change in procedure in order to create a situa- 
tion similar to that encountered in learning 
through identification. Subjects performed an 
orienting task but, unlike most incidental 
learning studies, the experimenter performed 
the diverting task as well and the extent to 
which the subjects patterned their behavior 
after that of the experimenter-model was 
measured. 

The main hypothesis tested is that nursery 
school children, while learning a two-choice 
discrimination problem, also learn to imitate 
certain of the experimenter’s behaviors which 
are totally irrelevant to the successful per- 
formance of the orienting task. 

One may expect, on the basis of theories of 
identification (Bronfenbrenner, 1960), that 
the presence of affection and nurturance in the 
adult-child interaction promotes incidental 
imitative learning, a view to which empirical 
studies of the correlates of strong and weak 
identification lend some indirect support. 

Boys whose fathers are highly rewarding and 
affectionate have been found to adopt the 
father-role in doll play activities (Sears, 1953), 
to show father-son similarity in response to 
items on a personality questionnaire (Payne & 
Mussen, 1956), and to display masculine behaviors 
(Mussen & Distler, 1959, 1960) to a greater extent 
than boys whose fathers are relatively cold and 
unrewarding. 

One interpretation of the relationship between 
nurturance and identification is that affectional 
rewards increase the secondary reinforcing properties 
of the model and, thus, predispose the imitator to 
reproduce the behavior of the model for the 
satisfaction these cues provide (Mowrer, 1950). Once 
the parental characteristics have acquired such reward 
value for the child, conditional withdrawal of 
positive reinforcers is believed to create additional 
instigation for the child to perform behaviors 
resembling that of the parent model, i.e., if the 
child can reproduce the parent’s rewarding behavior 
he can, thus, reward himself (Sears, 1957; Whiting & 
Child, 1953). In line with this theory of 
identification in terms of secondary reward, it is 
predicted that children who experience a warm, 
rewarding interaction with the experimenter-model 
should reproduce significantly more of the behaviors 
performed by the model than do children who 
experience a relatively distant and cold relationship. 


METHOD 


Subjects 

tion trials, and the extent to which the subjects 
reproduced the model’s behavior was measured. The 
experimental and control procedures differed only in 
the patterns of behavior displayed by the model. 


Matching Variable 

Dependency was selected as a matching variable 
since, on the basis of the theories of identification, 
dependency would be expected to facilitate imitative 
learning. There is some evidence, for example, that 
dependent subjects are strongly oriented toward 
gaining social rewards in the form of attention and 
approval (Cairns, 1959; Endsley & Hartup, 1960), and 
one means of obtaining these rewards is to imitate the 
behavior of others (Sears, Maccoby, & Levin, 1957). 
Moreover, such children do not have the habit of 
responding independently; consequently they are apt 
to be more dependent on, and therefore more attentive 
to, the cues produced by the behavior of others 
(Jakubczak & Walters, 1959; Kagan & Mussen, 1956). 

Measures of subjects’ dependency behavior were 
obtained through observations of their social 
interactions in the nursery school. The observers 
recorded subjects’ behavior using a combined 
time-sampling and behavior-unit observation method. 
Each child was observed for 12 10-minute observation 
sessions distributed over a period of approximately 
10 weeks; each observation session was divided into 
30-second intervals, thus yielding a total of 240 
behavior units. 

The children were observed in a predetermined order 
that was varied randomly to insure that each child 
would be seen under approximately comparable 
conditions. In order to provide an estimate of 
reliability of the ratings, 234 observation sessions 
(4,680 behavior units) were recorded simultaneously 
but independently by both observers. 

The subjects’ emotional dependency was assessed in 
terms of the frequency of behaviors that were aimed at 
securing a nurturant response from others. The fol- 
lowing four specific categories of dependency behavior 
were scored: seeking help and assistance, seeking praise 
and approval, seeking physical contact, and seeking 
General Procedure proximity and company of others. 

The dependency scores were obtained by summing 

Forty subjects were matched individually on the the observations made of these five different types of 
basis of sex and ratings of dependency behavior, and _ behaviors and, on the basis of these scores, the subjects 
subdivided randomly in terms of a nurturant-non- were paired and assigned at random to the two experi- 
nurturant condition yielding two experimental groups mental conditions. 
of 20 subjects each. A small control group comprising 
8 subjects was also studied. Experimental Conditions 

In the first phase of the experiment half the experi- 
mental and control subjects experienced two nurturant 
rewarding play sessions with the model while the re- 
maining subjects experienced a cold nonnurturant re- 
lationship. For the second phase of the experiment sub- 
jests performed a diverting two-choice discrimination 
problem with the model who exhibited fairly explicit, 
although functionless, behavior during the discrimina- 


The subjects were 24 boys and 24 girls enrolled in 
the Stanford University Nursery School. They ranged 
in age from 45 to 61 months, with a mean age of 53 
months. The junior author played the role of the model 
for all 48 children, and two other female experimenters 
shared in the task of conducting the study.* 


In the nonnurturant condition, the model brought the 
subject to the experimental room and after instructing 
the child to play with the toys that were spread on the 
floor, busied herself with paper work at a desk in the 
far corner of the room. During this period the model 
avoided any interaction with the child. 

In contrast, during the nurturant sessions the model 
sat on the floor close to the subject. She responded 

? The authors wish to express their appreciation to _ readily to the child’s bids for help and attention, and in 
Alice Beach and Mary Lou Funkhouser for their assist- other ways fostered a consistently warm, and rewarding 
ance in collecting the data, and to Ruth Barclay and __ interaction. 

Claire Korn for their help with the behavior ubserva- These experimental social interactions, which pre- 
tions. ceded the imitation learning, consisted of two 15-minute 
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sessions separated by an interval of approximately 5 
days. 


Diverting Task 

A two-choice discrimination problem, similar to the
one employed by Miller and Dollard (1941) in experiments
of matching behavior, was used as the diverting
task which occupied the subjects’ attention while at the 
same time permitting opportunities for the subjects to 
observe behavior performed by the model in the ab- 
sence of any instructions to observe or to reproduce the 
responses resembling that of the model. 

The apparatus consisted of two small boxes, identical
in color (red sides, yellow lid) and size (6” × 8” × 10”).
The hinged lid of each box was lined with rubber strip- 
ping so as to eliminate any auditory cues during the 
placement of the rewards which consisted of small 
multicolor pictures of animals and flowers. The boxes 
were placed on small chairs approximately 5 feet apart 
and 8 feet from the starting point. 

At the end of the second social interaction session 
the experimenter entered the room with the test ap- 
paratus and instructed the model and the subject that 
they were going to play a game in which the experi- 
menter would hide a picture sticker in one of the boxes 
and that the object of the game was to guess which box 
contained the sticker. 

The model and the subject then left the room and 
after the experimenter placed two stickers in the desig- 
nated box, they were recalled to the starting point in 
the experimental room and the model was asked to take 
the first turn. During the model’s trial, the subject re- 
mained at the starting point where he could observe 
the model’s behavior. 

Although initially it was planned to follow the pro- 
cedure used by Miller and Dollard (1941) in which one 
of two boxes was loaded with two rewards and the 
child made his choice immediately following the leader’s 
trial, this procedure had to be modified when it became 
evident during pretesting that approximately 40% of 
the subjects invariably chose the opposite box from the 
model even though the nonimitative response was con- 
sistently unrewarded. McDavid (1959), in a recent 
study of imitative behavior in preschool children, en- 
countered similar difficulties in that 44% of his sub- 
jects did not learn to imitate the leader even though the 
subjects were not informed as to whether the leader was 
or was not rewarded. 

In order to overcome this stereotyped nonimitation, 
the experimenter placed two rewards in a single box, but 
following the model’s trial the model and the subject 
left the room and were recalled almost immediately 
(the intratrial interval was approximately 5 seconds), 
thus, creating the impression that the boxes were re- 
loaded. After the subject completed his trial, the model 
and the subject left the room. The experimenter re- 
corded the subject’s behavior and reloaded the boxes 
for the second trial. The noncorrection method was 
used throughout. This procedure was continued until 
the subject met the learning criterion of four successive 
imitative discrimination responses, or until 30 acquisition
trials had been completed. The slight modification
in procedure proved to be effective as evidenced by the
fact that only 9 of the 48 children failed to meet the
criterion.

In order to eliminate any position habit, the right- 
left placements of the reward were varied from trial 
to trial in a fixed irregular order. This sequence was 
randomly determined except for the limitation that no 
more than two successive rewards could occur in the 
same position. 
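
The placement rule described above can be illustrated with a short sketch. This is not the authors' procedure: the function names, the seed, and the left/right coding are assumptions made for illustration. It only shows one way to draw a random left-right order in which no more than two successive rewards fall in the same position.

```python
import random


def reward_positions(n_trials, max_run=2, seed=0):
    """Return a list of 'L'/'R' placements with no run longer than max_run.

    Hypothetical helper: the seed and the 'L'/'R' coding are assumptions.
    """
    rng = random.Random(seed)
    sequence = []
    for _ in range(n_trials):
        choices = ["L", "R"]
        # If the last max_run placements share a side, force the other side.
        if len(sequence) >= max_run and len(set(sequence[-max_run:])) == 1:
            choices.remove(sequence[-1])
        sequence.append(rng.choice(choices))
    return sequence


def longest_run(sequence):
    """Length of the longest run of identical successive placements."""
    longest = current = 0
    previous = None
    for item in sequence:
        current = current + 1 if item == previous else 1
        longest = max(longest, current)
        previous = item
    return longest


# One fixed order for the maximum of 30 acquisition trials.
order = reward_positions(30)
```

Generating the order once and reusing it for every subject would correspond to the "fixed irregular order" of the text; whether the original sequence was produced in this way is not stated.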

The number of trials to criterion was the measure of 
the subjects’ imitation behavior on the discrimination 
task. 

Although the establishment of imitative choice re- 
sponses was, in itself, of some theoretical interest, the 
discrimination problem was intended primarily as an 
orienting or distraction task. Thus, on each discrimina- 
tion trial, the model exhibited certain verbal, motor, and 
aggressive behaviors which were totally irrelevant to 
the performance of the task to which the subject’s at- 
tention was directed. At the starting point, for example, 
the model remarked, “Here I go,” and then marched 
slowly toward the box containing the stickers repeating, 
“March, march, march.” On the lid of each box was a 
small rubber doll which the model knocked off aggres- 
sively when she reached the designated box. She then 
paused briefly, remarked, “Open the box,” removed one 
sticker and pasted it on a pastoral scene that hung on 
the wall immediately behind the boxes. The model 
terminated the trial by replacing the doll on the lid of 
the container. The model and the subject then left the 
room briefly. After being recalled to the experimental 
room the subject took his turn, and the number of the 
model’s behaviors reproduced by the subject was re- 
corded. 


Control Group 

In addition to the two experimental groups, a con- 
trol group, consisting of eight subjects, comparable to 
the experimental groups in terms of sex distribution, 
dependency ratings, and nurturant-nonnurturant ex- 
periences was studied. Since the model performed highly 
novel patterns of responses unlikely to occur inde- 
pendently of the observation of the behavior of the 
model, it was decided to assign most of the available 
subjects to the experimental groups and only a small 
number of subjects to the control group. 

The reasons for the inclusion of a control group were 
twofold. On the one hand, it provided a check on 
whether the subjects’ behavior reflected genuine imita- 
tive learning or merely the chance occurrence of be- 
haviors high in the subjects’ response hierarchies. 
Second, it was of interest to determine whether the 
subjects would adopt certain aspects of the model’s 
behavior that involved considerable delay in reward. 
With the controls, therefore, the model walked to the 
box, choosing a highly circuitous route along the sides 
of the experimental room; instead of aggressing toward 
the doll, the model lifted it gently off the container and 
she left the doll on the floor at the completion of a trial. 
While walking to the boxes the model repeated, “Walk, 
walk, walk.” 


Imitation Scores 
On each trial the subjects’ performances were scored
in terms of the following imitation response categories:
selects box chosen by the model; marches; repeats the
phrases, “Here I go,” “March, march,” “Open box,” or
“Walk, walk”; aggresses toward the doll; replaces doll on
box; imitates the circuitous route to the box.

Some subjects made a verbal response in the appro- 
priate context (for example, at the starting point, on the 
way to the box, before raising the lid of the container) 
but did not repeat the model’s exact words. These 
verbal responses were also scored and interpreted as 
partially imitative behavior. 

In order to provide an estimate of the reliability of 
the experimenter’s scoring, the performances of 19
subjects were scored independently by two judges who 
alternated in observing the experimental sessions 
through a one-way mirror from an adjoining observa- 
tion room. 

RESULTS 
Reliability of Observations of Dependency Be- 
havior 

The reliability of the observers’ behavior 
ratings was estimated by means of an index of 
agreement based on the ratio of twice the 
number of agreements over the combined 
ratings of the two observers multiplied by 100. 
Since small time discrepancies, due to in- 
evitable slight asynchronism of the observers’ 
timing devices, were expected, a time dis- 
crepancy in rating a given behavior category 
greater than two 30-second intervals was in- 
terpreted as a disagreement. 
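
As a rough illustration of the index of agreement just described, the sketch below computes twice the number of agreements divided by the combined number of ratings, multiplied by 100. The data layout (lists of interval-category pairs), the function name, and the greedy pairing of ratings are assumptions; the text specifies only the ratio and the two-interval tolerance.

```python
def agreement_index(ratings_a, ratings_b, tolerance=2):
    """Percentage agreement: 2 * agreements / (ratings by A + ratings by B) * 100.

    Each rating is an (interval_number, category) pair. Two ratings agree when
    the category matches and the intervals differ by no more than `tolerance`
    30-second intervals. The greedy one-to-one pairing is an assumption.
    """
    unmatched_b = sorted(ratings_b)
    agreements = 0
    for interval_a, category_a in sorted(ratings_a):
        for i, (interval_b, category_b) in enumerate(unmatched_b):
            same_category = category_b == category_a
            close_in_time = abs(interval_b - interval_a) <= tolerance
            if same_category and close_in_time:
                agreements += 1
                del unmatched_b[i]  # each rating can be matched only once
                break
    total = len(ratings_a) + len(ratings_b)
    return 100.0 * 2 * agreements / total if total else 0.0


# Hypothetical records from two observers for one session.
observer_a = [(1, "help"), (5, "contact"), (9, "praise")]
observer_b = [(2, "help"), (5, "contact"), (14, "praise"), (20, "proximity")]
index = agreement_index(observer_a, observer_b)
```

In this made-up example two of the ratings agree (the "praise" ratings lie five intervals apart and so count as a disagreement), giving 2 × 2 / 7 × 100, or about 57%.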

The interobserver reliabilities for the de- 
pendency categories considered separately 
were as follows: Positive attention seeking, 
84%; help seeking, 72%; seeking physical 
contact, 84%; and seeking proximity, 75%. 

Reliability of Imitation Scores

The percentage of agreement in scoring imitative behavior in the
experimental sessions is presented in Table 1. Except for other
imitative responses, the subjects’ behavior was scored with high
reliability and, even in the latter response category, the scoring
discrepancies arose primarily from the experimenter’s lack of
opportunity to observe some of the behaviors in question rather than
from differences of interpretation. For example, when the subject made
appropriate mouth movements but emitted no sound while marching toward
the containers, this partial imitation of the model’s verbalizations
could not be readily observed by the experimenter (who was at the
starting point) but was clearly evident to the rater in the observation
room.


TABLE 1
SCORER RELIABILITY OF IMITATIVE RESPONSES

Response Category                        Percentage Agreement
Aggression                               98
Marching                                 73
Imitative verbal behavior                80
Partially imitative verbal behavior      83
Other imitative responses                50
Replaces doll                            99
Circuitous route                         96


TABLE 2
AMOUNT OF IMITATIVE BEHAVIOR DISPLAYED BY SUBJECTS IN THE
EXPERIMENTAL AND CONTROL GROUPS

                                       Experimental Subjects     Control Subjects
                                       (N = 40)                  (N = 8)
Response Category                      Percentage   Mean per     Percentage   Mean per
                                       Imitating    Trial        Imitating    Trial
Behaviors of experimental model
  Marching                             45           [illegible]  0            0
  Verbal responses                     28           .10          0            0
  Aggression                           90           .64          13           .01
  Other imitative responses            18           .03          0            0
  Partially imitative verbal behavior  43           .11          0            0
  Replacing doll                       90           .60          75           .77
Behaviors of control model
  Circuitous route                     0            0            75           .58
  Verbal responses                     0            0            13           .10

Note.—The mean number of trials for subjects in the experimental group
(13.52) and in the control group (15.25) did not differ significantly.


Incidental Imitation of Model’s Behavior

Since the data disclosed no significant sex 
differences, the imitation scores for the male 
and female subgroups were combined in the 
statistical analyses. 

Ninety percent of the subjects in the experimental groups adopted the
model’s aggressive behavior, 45% imitated the marching, and 28%
reproduced the model’s verbalizations. In contrast, none of the control
subjects behaved aggressively,³ marched or verbalized, while 75% of the
controls and none of the experimental subjects imitated the circuitous
route to the containers. Except for replacing the doll on the box,
which was performed by most of the experimental and control subjects,
there was no overlap in the imitative behavior displayed by the two
groups (see Table 2).

³ One subject in the control group hit the doll off the box on one
trial only.

While the control subjects replaced the doll 
on the box slightly more often than the subjects 
in the experimental group, this difference 
tested by means of the median test was not
statistically significant (χ² = 1.49; df = 1).
Evidently the response of replacing things, 
undoubtedly overtrained by parents, is so well 
established that it occurs independently of the 
behavior of the model. Since this was clearly a 
nonimitative response, it was not included in 
the subsequent analyses. 

To the extent that behavior of the sort 
evoked in this study may be considered an 
elementary prototype of identification, the
results presented in Table 2 add support to 
the interpretation of identification as a process 
of incidental imitative learning. 


Effects of Nurturance on Imitation

In order to make comparable the imitation
scores for the subjects who varied somewhat in 
the number of trials to criterion, the total 
imitative responses in a given response cate- 
gory were divided by the number of trials. 
Since only a small number of subjects in the 
nonnurturant condition displayed imitative 
nonaggressive behavior and the distributions of 
scores were markedly skewed, the sign test 
was used to estimate the significance of differ- 
ences between the two experimental groups. 
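
The two steps described above, dividing each subject's total imitative responses by the number of trials and then comparing matched pairs with the sign test, can be sketched as follows. The data are invented for illustration, the function names are hypothetical, and the two-sided form of the test is an assumption, since the text does not say whether one- or two-tailed probabilities were used.

```python
from math import comb


def per_trial_score(total_responses, n_trials):
    """Imitative responses in a category divided by the number of trials."""
    return total_responses / n_trials


def sign_test(pairs):
    """Two-sided exact sign test on (nurturant, nonnurturant) score pairs."""
    plus = sum(1 for a, b in pairs if a > b)
    minus = sum(1 for a, b in pairs if a < b)
    n = plus + minus  # tied pairs are dropped
    if n == 0:
        return 1.0
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign under a fair-coin null.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Invented (responses, trials) records for nine matched pairs.
nurturant = [(6, 12), (4, 10), (9, 15), (3, 12), (5, 10),
             (8, 16), (2, 10), (7, 14), (1, 10)]
nonnurturant = [(2, 10), (1, 10), (3, 15), (0, 12), (1, 10),
                (4, 16), (4, 10), (2, 14), (1, 10)]

pairs = [(per_trial_score(*a), per_trial_score(*b))
         for a, b in zip(nurturant, nonnurturant)]
p_value = sign_test(pairs)
```

Because the sign test uses only the direction of each within-pair difference, it makes no assumption about the shape of the score distributions, which is why it suits markedly skewed scores of the kind reported here.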
The predicted facilitating effect of social
rewards on imitation was essentially confirmed
(see Table 3). Subjects who experienced the
rewarding interaction with the model marched
and verbalized imitatively, and reproduced
other responses resembling that of the model
to a greater extent than did the subjects who 
experienced the relatively cold and distant re- 
lationship. Aggression, interestingly, was 
readily imitated by subjects regardless of the 
quality of the model-child relationship. 


Imitation of Discrimination Responses

A three-way analysis of variance (McNemar, 1955, Case XVII) of the
trials scores failed to show any significant effects of nurturance or
sex of imitator on the imitation of discrimination responses (see
Table 4), nor did the two groups of experimental subjects differ
significantly in the number of trials in which they imitated the
model’s choice or in the number of trials to the first imitative
discrimination response.


TABLE 3
SIGNIFICANCE OF DIFFERENCES IN IMITATIVE BEHAVIOR EXHIBITED BY SUBJECTS
IN THE NURTURANT AND NONNURTURANT EXPERIMENTAL CONDITIONS

                                        Number of Subjects Imitating
Response Category                       Nurturant    Nonnurturant    p
                                        (N = 20)     (N = 20)
Nonaggressive behaviors                 15           7               .04
  Marching                              13           5               .05
  Verbal behavior                       9            2               .05
  Other imitative responses             6            1               .06
Aggressive behavior                     20           16              ns
Partially imitative verbal responses    12           [illegible]     [illegible]

Note.—The two groups of subjects did not differ in the mean number of
trials to criterion. The means for subjects in the nurturant and
nonnurturant conditions were 13.75 and 13.30, respectively.


TABLE 4
ANALYSIS OF VARIANCE OF SUBJECTS’ TRIALS SCORES ON THE DISCRIMINATION
LEARNING TASK

Source of Variance     df    Variance Estimate    F       p
Sex                    1     294                  4.20    .10 > p > .05
Nurturance             1     [illegible]          <1      ns
Sex × Nurturance       1     158                  1.45    ns
Matched pairs          14    70
Remainder              14    109

Note.—One subject who refused to continue the task before he reached
the learning criterion and three subjects who could be run for only 20
trials had to be excluded from this analysis. The results, therefore,
are based on 32 matched pairs.

While nurturance did not seem to influence the actual choices the
subjects made, it nevertheless affected their predecision behavior. A
number of the children displayed considerable conflictful vacillation,
often running back and forth between the boxes, prior to making their
choice. In the analysis of these data, the vacillation scores were
divided by the total number of trials, and the significance of the
differences was estimated by means of the sign test since the
distribution of scores was markedly skewed. The results of this test
revealed that the subjects in the nurturant condition exhibited more
conflictful behavior than subjects in the nonnurturant group (p = .03).
This finding is particularly noteworthy considering that one has to
counteract a strong nonimitation bias in getting preschool children to
follow a leader in a two-choice discrimination problem, as evidenced by
McDavid’s (1959) findings as well as those of the present study (i.e.,
75% of the subjects made nonimitative choices on the first trial).


Dependency and Imitation 


Correlations between the ratings of dependency behavior and the
measures of imitation were calculated separately for the nurturant and
nonnurturant experimental subgroups, and where the correlation
coefficients did not differ significantly the data were combined. The
expected positive relationship between dependency and imitation was
only partially supported. High dependent subjects expressed more
partially imitative verbal behavior (r_t = .60; p < .05) and exhibited
more predecision conflict on the discrimination task (r = .26; p = .05)
than did subjects who were rated low on dependency.

Dependency and total imitation of nonaggressive responses were
positively related for boys (r_t = .31) but negatively correlated for
girls (r_t = -.46). These correlations, however, are not statistically
significant. Nor was there any significant relationship between
dependency and imitation of aggression (r = .20) or discrimination
responses (r_t = -.03).


DISCUSSION 


The results of this study generally substantiate the hypotheses that
children display a good deal of social learning of an incidental
imitative sort, and that nurturance is one condition facilitating such
imitative learning.

The extent to which the model’s behavior had come to influence and
control the behavior of subjects is well illustrated by their marching,
and by their choice of the circuitous route to the containers. Evidence
from the pretesting and from the subjects’ behavior during the early
discrimination trials revealed that dashing toward the boxes was the
dominant response, and that the delay produced by marching or by taking
an indirect route that more than doubled the distance to the boxes was
clearly incompatible with the subjects’ eagerness to get to the
containers. Nevertheless, many subjects dutifully followed the example
set by the model.

Even more striking was the subjects’ imitation of responses performed
unwittingly by the model. On one trial with a control subject, for
example, the model began to replace the doll on the box at the
completion of the trial when suddenly, startled by the realization of
the mistake, she quickly replaced the doll on the floor. Sure enough,
on the next trial, the subject took the circuitous route, removed the
doll gently off the box and, after disposing of the sticker, raised the
doll, and then quickly replaced it on the floor, reproducing the
model’s startled reaction as well!

The results for the influence of nurturance on imitation of verbal
behavior are in accord with Mowrer’s (1950) autism theory of word
learning. Moreover, the obtained significant effect of nurturance on
the production of partially imitative verbal responses indicates that
nurturance not only facilitates imitation of the specific behaviors
displayed by a model but also increases the probability of responses of
a whole response class (for example, verbal behavior). These data are
essentially in agreement with those of Milner (1951), who found that
mothers of children receiving high reading readiness scores were more
verbal and affectionately demonstrative in the interactions with their
children than were the mothers of subjects in the low reading ability
group.

That the incidental cues of the model’s be- 
havior may have taken on positive valence and 
were consequently reproduced by subjects for 
the mere satisfaction of performing them, is 
suggested by the fact that children in the nur- 
turant condition not only marched to the con- 
tainers but also marched in and out of the 
experimental room and marched about in the 
anteroom repeating, “March, march, march,” 
etc., while waiting for the next trial. While 
certain personality patterns may be, thus, in- 
cidentally acquired, the stability and per- 
sistence of these behaviors in the absence of 
direct rewards by external agents remains to 
be studied.

A response cannot be readily imitated unless its components are within
the subjects’ behavior repertoire. The fact that gross motor responses
are usually more highly developed than verbal skills in young children
may explain why subjects reproduced the model’s marching (p = .05) and
aggression (p < .001) to a significantly greater extent than they did
her verbal behavior. Indeed several subjects imitated the motor
component of speech by performing the appropriate mouth movements but
emitted no sound. The greater saliency of the model’s motor responses
might also be a possible explanation of the obtained differences.

“Identification with the aggressor” (Freud, 1937) or “defensive
identification” (Mowrer, 1950), whereby a child presumably transforms
himself from object to agent of aggression by adopting the attributes
of an aggressive, punitive model so as to allay anxiety, is widely
accepted as an explanation of the imitative learning of aggression. The
results of the present study, and those of a second experiment now in
progress, suggest that the mere observation of aggressive models,
regardless of the quality of the model-child relationship, is a
sufficient condition for producing imitative [illegible] are feared,
who are liked and esteemed, or who are more or less neutral figures
would throw some light on whether or not a more parsimonious theory
than the one involved in [illegible] the modeling process.

Although the results from the present study [illegible] be expected,
according to the secondary reinforcement theory of imitation, to
furnish a stronger incentive than nurturance alone for subjects to
reproduce a model’s behavior. It is also possible that dependency may
be essentially unrelated to imitation under conditions of consistent
nurturance, but may emerge as a variable facilitating imitation under
conditions where social reinforcers are temporarily withdrawn.

The experiment reported in this paper focused on immediate imitation in
the presence of the model. A more crucial test of the transmission of
behavior through the process of social imitation involves the
generalization of imitative responses to new situations in which the
model is absent. A study of this type, involving the delayed imitation
of both male and female aggressive models, is currently under way.


SUMMARY

The present study was primarily designed to test the hypotheses that
children would learn to imitate behavior exhibited by an
experimenter-model, and that a nurturant interaction [illegible]
properties of the model and thus facilitate such imitative learning.

Forty-eight preschool children performed a diverting two-choice
discrimination problem with a model who displayed fairly explicit,
although functionless, behaviors during the trials. With the
experimental subjects the model marched, emitted specific verbal
responses, and aggressed toward dolls located on the discrimination
boxes; with the controls the model walked to the boxes choosing a
highly circuitous route and behaved in a nonaggressive fashion. Half
the subjects in the experimental and control groups experienced a
rewarding interaction with the model prior to the imitative learning
while the remaining subjects experienced a cold and nonnurturant
relationship.

1. [illegible] of imitative responses they displayed.

2. The predicted facilitating effect of social rewards on imitation was
also confirmed, the [illegible] readily imitated by the subjects
regardless of the quality of the model-child relationship.

3. Although nurturance was not found to influence the rate of imitative
discrimination learning, subjects in the nurturant condition exhibited
significantly more predecision conflict behavior than did subjects in
the nonnurturant group.


OF HOSTILITY¹


DONALD J. VELDMAN AND PHILIP WORCHEL


Once aroused, the hostile impulse may
be expressed in diverse ways. It may
lead to overt expression in verbal or
physical attack, may be expressed only in
fantasy, or may be completely denied expression.
Aggression may be displaced toward targets 
apparently unconnected with the arousal of 
the impulse. In the search for the determinants 
of the reactions to the hostile impulse, in- 
vestigators have turned to the socialization 
process and to personality theories. Child 
rearing practices which are predicted to 
produce specific kinds of hostile responses 
have been investigated (Bornston & Coleman, 
1956; Child, 1954; Sears, Maccoby, & Levin, 
1957). Hypotheses relating ego control (Block 
& Martin, 1955; Livson & Mussen, 1957), 
self-ideal (SI) discrepancy (Rothaus & Wor- 
chel, 1960; Worchel, 1958), and anxiety 
(Doris & Sarason, 1955) to aggression have 
been tested. In general, these investigations 
have yielded significant but low correlations
and have been limited by the circumscribed
and ambiguous nature of the response measures
employed.
Hostility has been used to refer to drive, 
emotion, attitude, and overt response, and 
studies often fail to specify what component 
of hostility is being measured. In addition, 
hostility arousal and reduction have usually 
been assessed immediately after the inter- 
polation of some experimental variable; the 
factor of time itself has been neglected. 
Common observations suggest that time may 
be an important consideration affecting hostile 
feelings and attitudes differentially. People
“blow up” suddenly, and then anger seems to
subside. Attitudes of dislike and rejection may 
continue unchanged (Newcomb, 1947). It is 
the purpose of the present study to subject
these observations to experimental verification
and to extend knowledge of the role of personality
in the management of hostile feelings
and attitudes. Four types of personalities
based upon self-concept theory (Rogers,
1951) are described. Hypotheses are then
derived relating defensiveness and self-acceptance
to the management of hostility.

¹ This research was supported in part by the United
States Air Force under Contract No. AF 49(638)-460
monitored by the AF Office of Scientific Research of
the Air Research and Development Command.

Self-theory contends that the structure of 
the self is formed as a result of direct experi- 
ence with the environment as well as evalu- 
ational interactions with others. Since hostility 
is a major product of the inevitable frustra- 
tions of living, the attitudes and controls 
imposed by socializing agents upon the ex- 
pression of hostility play an influential role in 
the development of the self-structure. For 
some individuals, all forms of hostile behavior 
have been met with severe counteraggression. 
To maintain an accepting self-picture, these 
persons, as Rogers (1951) describes, are 
compelled to deny any hostile experience. 
Their self- and ideal-concepts remain con- 
gruent, and a facade of self-acceptance is 
presented but at the expense of the continual 
repression of hostility. Block and Thomas 
(1955) and Hatfield (1958) have shown, for 
example, that persons with low SI discrepan- 
cies tend to use repressive defenses. Also, 
Altrocchi, Parsons, and Dickoff (1960) have 
demonstrated that repressors manifest smaller 
SI discrepancies than sensitizers. 

The healthy or adjusted personality “exists 
when the concept of the self is such that all the 
sensory and visceral experiences of the or- 
ganism are, or may be, assimilated on a 
symbolic level into a consistent relationship 
with the concept of self” (Rogers, 1951, p. 513). 
Hostile feelings are accepted and the resultant 
behavior would “probably be at times social 
and at other times aggressive” (p. 502). 
Here, also, the SI picture is one of congruence 
but without the operation of defensive pat- 
terns. 

There are persons, however, whose early 
development has led to rejecting self-pictures. 
Their attempts to attain internalized ideals 
have met with parental criticism and punishment.
Their self- and ideal-concepts are
highly incongruent, and they present a picture
of anxiety, tension, and depreciation. Authority
figures are perceived as threatening and
they are unable to retaliate for fear of punishment.
Hostility can be displaced to “scapegoats”
who are unable to retaliate. As a result
of stimulus-generalization, aggression-anxiety
is experienced (Miller, 1948), leading to
further depreciation and tension. These
subjects may be described as nondefensive 
with high SI discrepancy. Worchel (1958), 
for example, has shown that subjects with 
high SI discrepancy express significantly 
fewer direct aggressive responses towards an 
authority figure than those with low SI 
discrepancy. In a later study, Rothaus and 
Worchel (1960), using a questionnaire describ- 
ing arbitrary and nonarbitrary frustrations, 
found that such subjects gave more hostile 
actions toward a hypothetical frustrating
agent than those with low SI discrepancy. In 
this case, however, there was no danger of 
retaliation. Worchel has just completed a 
study in which, following the experimental 
induction of frustration by a professor, sub- 
jects with high SI discrepancy expressed 
stronger hostility towards “innocent” targets 
than those with low discrepancy. 

Persons with high SI discrepancy can 
defend themselves against anxiety and tension 
by rationalizing and distorting their failure to 
attain the idealized goals. They can set up a 
facade of “doing as well as others.” They
perceive their behavior as above reproach 
as far as the social group is concerned. Aggres- 
sion displaced towards innocent scapegoats 
does not arouse aggression-anxiety in this 
case since justification is found for such 
attack. They distort the situation thus: “it is
not my father (or other authority figure) who
makes me angry, but it is ____ who is
irritating.”

To summarize, four types of personalities 
are derived from Self-Concept theory: low 
defensive, low SI discrepancy (adjustive); 
high defensive, low SI discrepancy (repressive);
low defensive, high SI discrepancy (anxious); 
and high defensive, high SI discrepancy 
(distorters). 

If self-acceptance and defensiveness interact 
to influence the management of hostility as
described above, then the following predictions
are implied: 

1. Subjects with low defensive, low SI
discrepancy (adjustive) will express the
strongest feelings of anger and hostile attitudes,
while high defensive, low SI discrepancy
subjects (repressive) will express the
least anger and hostile attitudes of all the
groups.

2. Subjects with low defensive, high SI 
discrepancy (anxious) will indicate the most 
aggression-anxiety, while low defensive, low 
SI discrepancy subjects (adjustive) will 
indicate the least aggression-anxiety of all
the groups. 

3. Subjects with high SI discrepancy will 
displace more hostility than subjects with low 
SI discrepancy. 

4. Feelings of anger will tend to decrease 
during a delay period following hostility 
arousal, while hostile attitudes will tend to 
persist undiminished. 


METHOD 
Subjects 


Eighty undergraduate males were drawn from an 
introductory psychology course on the basis of the 
Worchel (1958) Self-Activity Inventory (SAI) and the 
K scale items from the MMPI (Meehl & Hathaway, 
1956). SAI served as a measure of SI discrepancy, and 
the K scale as an index of defensiveness. The subjects 
were dichotomized with respect to each of these tests, 
using cutting points of 45.5 on the SAI and 14.5
on the K scale. Four groups of 20 subjects were thus
obtained: low SI discrepancy and low defensiveness
(LSI-LK), high SI discrepancy and low defensiveness
(HSI-LK), low SI discrepancy and high defensiveness
(LSI-HK), and high SI discrepancy and high defensiveness
(HSI-HK).
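The dichotomization described above can be sketched as follows. This is only an illustration: the function name, the example scores, and the treatment of the K cutting point as a parameter (with 14.5 used purely for demonstration) are assumptions, not part of the original procedure.

```python
from collections import Counter

SAI_CUT = 45.5  # cutting point on the SAI, as stated in the text


def classify(sai: float, k: float, k_cut: float) -> str:
    """Return the group label for one subject's SAI and K scores.

    Scores above a cutting point count as "high"; k_cut is supplied
    by the caller because it is an assumption in this sketch.
    """
    si = "HSI" if sai > SAI_CUT else "LSI"
    defensiveness = "HK" if k > k_cut else "LK"
    return f"{si}-{defensiveness}"


# Hypothetical (SAI, K) scores, for illustration only.
subjects = [(30, 10), (60, 9), (41, 20), (52, 18)]
labels = [classify(sai, k, k_cut=14.5) for sai, k in subjects]
print(Counter(labels))
```

With real data, each of the four labels would be expected to cover 20 subjects, matching the group sizes reported in the text.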

Ten subjects in each of these four groups were 
assigned at random to one of two experimental condi- 
tions. In the nondelay (ND) condition the hostility 
measures were administered immediately after hostility 
arousal. In the delay (D) condition hostility measure- 
ment was delayed 20 minutes. Three replications of the 
design were completed. The subjects were distributed 
as evenly among the sessions as their schedules per- 
mitted; no session included less than 10 or more than 
16 subjects. Subjects in each session were drawn from 
different psychology sections to eliminate social 
structuring present in the original classes. 


Procedure 


The initial phase of the experiment was the same 
for all sessions. The examiner told the subjects his 
name, that he was a professor, that the test results 
would go on their permanent records, and that the 
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«ores would be used for vocational guidance, scholar- 
ship recommendations, and admission to graduate 
The University of Texas Intelligence Test was then 
administered. Each of its 10 subtests was given with 
| time limits of 30-90 seconds. The examiner used a 
stop watch to time the tests and made frequent dis- 
| tracting remarks to the subjects about their slowness, 
| the simplicity of the questions, and the penalty for 
\ cuessing wrongly. They were told that most people 
| should finish within the allotted time. The time limits 
were, in fact, too short for any subject to complete the 
subtests. Thus failure, distraction, and insult con- 
stituted the frustrating situation, Spontaneous 
comments of the subject during and after the experi- 
ment indicated considerable frustration and annoyance 
asa result of this treatment. 

The examiner left the room “to score the tests”
as soon as the “intelligence test” was completed. At no
time was the examiner aware of the experimental 
treatment (delay) for any particular group he tested. 
The examiner’s assistant distributed the next set of 
answer sheets as quickly as possible to prevent any 
verbal interaction among subjects. 

In the second phase of each session, the ND groups 
took the posttreatment tests immediately. The D 
groups completed a 20-minute word association task 
before taking the final battery of tests. The interpolated 
task consisted of writing free association responses to 
the Kent-Rosanoff list of 100 words presented orally 
by the assistant. Since this list contains words chosen 
for their lack of emotional content, it was assumed that 
this task would be relatively neutral in its cathartic, 
guilt arousing, or hostility provoking aspects. At the 
end of each session the subjects were told the nature of 
the experiment and were asked not to discuss it until 
all sessions had been completed. 


Dependent Variables 


The last set of instruments given in all sessions 
formed a booklet of four tests² from which six dependent
variables were derived. 

Direct Anger. This instrument served as an index of 
the way the subject was willing or able to characterize 
his own emotional state at the time of testing. A list of 
nine adjectives, each followed by a five-point rating 
scale, was presented with the following instructions: 

    Intelligence testing produces various feelings in
    those being tested. This questionnaire does not have
    any right or wrong answers; you are asked only to
    report your own feelings as accurately as possible.
    Place a check mark after each adjective so as to
    describe how you feel at the present time.





² The four tests used in the present study have been
deposited with the American Documentation Institute.
Order Document No. 6865 from ADI Auxiliary
Publications Project, Photoduplication Service, Library
of Congress; Washington 25, D. C., remitting in
advance $1.25 for microfilm or $1.25 for photocopies.
Make checks payable to: Chief, Photoduplication
Service, Library of Congress.
The scores for the adjectives angry, irked, and
annoyed were summed for the Direct Anger score. 
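As a minimal sketch of this scoring rule, the snippet below sums the three named adjectives from one subject's ratings. The numeric coding of the five-point scale as 1 to 5, the function name, and the example ratings (including the filler adjective "calm") are assumptions made for illustration.

```python
ANGER_ITEMS = ("angry", "irked", "annoyed")  # adjectives named in the text


def anger_score(ratings: dict[str, int]) -> int:
    """Sum the ratings of the three anger adjectives.

    Assumes each adjective is coded 1..5 on its five-point scale.
    """
    for adjective in ANGER_ITEMS:
        if not 1 <= ratings[adjective] <= 5:
            raise ValueError(f"rating for {adjective!r} is outside 1..5")
    return sum(ratings[adjective] for adjective in ANGER_ITEMS)


# Hypothetical ratings; "calm" stands in for the unnamed filler adjectives.
example = {"angry": 4, "irked": 3, "annoyed": 5, "calm": 1}
print(anger_score(example))
```

Under this coding the score ranges from 3 to 15, and the filler adjectives do not contribute to it.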

Projected Anger. Since the self-report measure of 
anger might be influenced by inhibitory mechanisms, 
an identical version of the adjective list was included, 
but with the following instructions: 

    The ability to get along with people is related to
    the ability to understand how most people react to a
    situation, even if one’s own reactions are somewhat
    different. Here you are asked to fill out the test the
    way you think most people would feel now after an
    intelligence test similar to the one you just took.

A comparable Projected Anger score was derived 
from the responses to the same three adjectives used for 
Direct Anger. Internal consistency estimates of the 
reliability of the score for Direct Anger and Projected 
Anger, based on the average item-intercorrelations, 
were .78 and .89, respectively. 
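One common way to turn an average inter-item correlation into a reliability estimate is the Spearman-Brown (standardized alpha) formula. The text does not say which formula was used, so the sketch below is an assumption; it shows the formula and its inverse for a three-item score.

```python
def reliability_from_mean_r(mean_r: float, n_items: int) -> float:
    """Spearman-Brown estimate from the average inter-item correlation."""
    return n_items * mean_r / (1 + (n_items - 1) * mean_r)


def mean_r_from_reliability(reliability: float, n_items: int) -> float:
    """Invert the formula: average correlation implied by a reliability."""
    return reliability / (n_items - (n_items - 1) * reliability)


# Average correlations that would yield the reported reliabilities,
# assuming three items and the Spearman-Brown formula.
for reported in (0.78, 0.89):
    implied_r = mean_r_from_reliability(reported, n_items=3)
    print(f"reliability {reported:.2f} -> mean r of about {implied_r:.2f}")
```

If a different estimator was used in the original analysis, the implied correlations would differ accordingly.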

Hostile Attitudes. A test of Insight and Social
Sensitivity included 10 items asking for an appraisal of 
the examiner. Each statement was followed by a six- 
point rating scale, and the subject was asked to indicate 
the extent of his agreement or disagreement with each 
statement. Two components of hostile attitudes were 
assessed separately: rational and irrational aggression. 
Rational aggression (Examiner Blame) was defined as 
“blaming the examiner” for the subject’s poor per- 
formance since, in reality, he did distract, interrupt, 
and upset the subject. Irrational aggression (Examiner 
Devaluation) consisted of statements condemning the 
examiner on characteristics which had little or nothing 
to do with the frustrating situation. The Examiner Blame
score was the sum of the agreement scores for five
statements which blamed the examiner for errors made
by the subject on the “intelligence test” (e.g., “The
examiner was to blame for some of the errors I made”).
The Examiner Devaluation score was similarly derived
from five statements concerning various aspects of the
examiner’s personality (e.g., “The examiner appears to
be a dependable person”). Internal consistency estimates
of the reliability of these scores were .80 and
.87, respectively.

Displaced Hostility and Aggression-Anxiety. Ten 
stems from a sentence completion test developed by 
Zimmer (1959) were selected to measure these aspects 
of the hostile reaction. The responses were scored 
according to Zimmer’s manual. A more detailed
description of the method and instruments used is 
available elsewhere (Veldman, 1960). 


RESULTS AND DISCUSSION 


Since the six measures were obtained from
the same subject, interactions may occur to
confound the results. It is also possible that,
regardless of the apparent face validity of the
measures, the tests are all really tapping the
same variable. Intercorrelations between variables,
however, are all less than .37 with the
exception of that between Direct and Projected
Anger, which, as would be expected, is
somewhat higher (.59).³

The means of each of the eight experimental
groups are presented in Table 1. Analyses of
variance (SI X K X D) were carried out for
each of the six dependent variables and the
results are presented in Tables 2 and 3.

TABLE 1
MEANS OF ALL SUBGROUPS ON MEASURES OF HOSTILE FEELINGS AND ATTITUDES FOR THE DELAY
(D) AND NONDELAY (ND) CONDITIONS

                      LSI-LK                  HSI-LK                  LSI-HK                  HSI-HK
Measure               ND          D           ND          D           ND          D           ND          D
Direct Anger          9.4         7.9         7.3         [illegible] 6.8         5.3         8.0         7.2
Projected Anger       9.0         7.7         7.9         7.9         8.6         6.4         9.6         6.6
Examiner Blame        25.8        19.6        21.9        21.5        19.8        20.3        22.9        19.1
Examiner Devaluation  15.7        14.2        12.5        14.9        13.9        12.4        13.3        14.7
Displaced Hostility   2.7         [illegible] 3.8         2.9         2.4         2.9         3.2         3.3
Aggression-Anxiety    2.4         2.4         [illegible] 2.8         2.6         3.9         3.4         3.1

TABLE 2
ANALYSIS OF VARIANCE OF HOSTILE FEELINGS (ANGER) AND ATTITUDES (EXAMINER BLAME)
FOR ALL GROUPS

                           Direct Anger            Projected Anger         Examiner Blame
Source of variation   df   MS          F           MS          F           MS          F
Delay (D)             1    15.31       1.86        52.81       6.63*       122.51      [illegible]
Defensive (K)         1    27.61       3.35        2.11        .27         56.11       1.72
Self-ideal (SI)       1    [illegible] [illegible] [illegible] [illegible] [illegible] [illegible]
D X K                 1    1.01        .12         19.01       2.30        13.61       [illegible]
D X SI                1    7.81        .95         .31         .06         2.81        [illegible]
K X SI                1    40.61       4.93*       5.51        .69         19.01       [illegible]
D X K X SI            1    [illegible] .22         5.52        [illegible] [illegible] [illegible]
Within                72   8.25                    7.97                    32.57
* Significant at the .05 level

Hostile Feelings. As predicted, the interaction
of K X SI for Direct Anger (see Table
2) is significant at the .05 level (F = 4.93).
Whereas the mean scores (regardless of delay)
for the two high SI groups (low and high
defensiveness) are practically identical (7.4
and 7.7, respectively), the means of the low
SI groups are radically different. The greatest
expression of anger is produced by the adjusted
(8.7) and the least anger is expressed by the
repressive subjects (6.4). Defensiveness, as
measured by the K scale, is an important
determinant of the direct expression of anger,
particularly with subjects of low SI discrepancy.
Taken alone, defensiveness is almost
significant (F = 3.35) at the .05 level (F of
3.98 required for .05 level).
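Each F ratio discussed here is the mean square for an effect divided by the within-groups mean square. The short sketch below recomputes three Direct Anger ratios from the mean squares printed in Table 2 and compares them with the critical value of 3.98 quoted in the text; the function and variable names are illustrative assumptions.

```python
MS_WITHIN = 8.25   # within-groups mean square for Direct Anger (Table 2)
F_CRIT_05 = 3.98   # F required at the .05 level, as quoted in the text

# Effect mean squares for Direct Anger, as printed in Table 2.
effects = {"Delay (D)": 15.31, "Defensive (K)": 27.61, "K X SI": 40.61}


def f_ratio(ms_effect: float, ms_within: float) -> float:
    """F is the effect mean square over the within-groups mean square."""
    return ms_effect / ms_within


for name, ms in effects.items():
    f = f_ratio(ms, MS_WITHIN)
    verdict = "significant" if f >= F_CRIT_05 else "not significant"
    print(f"{name}: F = {f:.2f} ({verdict} at the .05 level)")
```

The recomputed ratios agree with the printed F values to within about .01, the small differences presumably reflecting rounding of the mean squares.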

Delay does not seem to be a significant 
variable in the direct expression of anger. In 
the projective expression of anger, however, 
delay is significant (see Table 2). Subjects in
the D groups projected less anger than those 
in the ND groups. 

Defensive subjects would be expected to
reveal more anger in the projective than in the
direct test. That this is partly true is shown by
comparing the differences between the two
techniques for the high and low defensive
groups regardless of SI discrepancy. The mean
difference of 1.7 for the high defensive groups
under the ND condition is significant at the
.02 level (t = 2.58). In the projective technique,
these subjects expressed greater anger
than in the direct version. For the low defensive
subjects, under both D and ND conditions,
there is no significant difference between
the two versions of the test.

³ A 1-page table giving the intercorrelations of all
six variables has been deposited with the American
Documentation Institute. Order Document No. 6865
from ADI Auxiliary Publications Project, Photoduplication
Service, Library of Congress; Washington
25, D. C., remitting in advance $1.25 for microfilm or
$1.25 for photocopies. Make checks payable to: Chief,
Photoduplication Service, Library of Congress.

Hostile Attitudes. It was predicted that the
adjusted group would express strongest hostility
while the repressive group would express
least hostility toward the examiner. On
rational aggression (Examiner Blame), the
means in Table 1 were in the predicted direction
(25.8 and 19.8, respectively) in the ND
condition. The F ratios, however, show that
these results are influenced by delay. The
main effect of delay is almost significant at the
.05 level and the triple interaction, D X K X
SI, barely reaches this level of significance.
Examiner Blame is greater immediately after
the frustrating test than 20 minutes later.
The adjusted group which has the highest
Examiner Blame score in the ND condition
(25.8) shows the greatest drop in hostility
after delay (6.2). The lowest Examiner Blame
score was given by the repressive group
(19.8), and with delay, the mean Examiner
Blame score actually increased slightly (20.3).
Even where the frustrating agent is clearly
responsible for interfering with performance,
the repressive subjects show far less rational
aggression than the adjusted subjects. With
delay, it is interesting to note the decreased
tendency to blame the examiner by the
adjusted subjects.

On Examiner Devaluation (see Table 3),
which represents the irrational component of
aggression, there are no significant F ratios
for any of the main variables or interactions.
Thus while the adjusted subjects are highest
in rational aggression, they express no greater
hostility on “unreasonable” items.

DEFENSIVENESS, SELF-ACCEPTANCE, AND HOSTILITY

TABLE 3
ANALYSIS OF VARIANCE OF DISPLACED HOSTILITY
AND AGGRESSION-ANXIETY FOR
ALL GROUPS

                           Displaced Hostility     Aggression-Anxiety
Source of variation   df   MS          F           MS          F
Delay (D)             1    .01         .01         .01         .01
Defensive (K)         1    .31         .22         3.61        3.37
Self-ideal (SI)       1    6.61        4.74*       3.61        3.37
D X K                 1    [illegible] [illegible] [illegible] [illegible]
D X SI                1    2.81        2.01        7.81        7.30**
K X SI                1    .01         .01         3.61        3.37
D X K X SI            1    .61         .44         .61         .57
Within                72   1.40                    1.07
* Significant at the .05 level
** Significant at the .01 level

Displaced Hostility. As predicted, there is a 
significant F ratio (4.74) for the SI variable
(see Table 3). The high discrepancy subjects
expressed significantly more hostility towards 
other persons than the low SI subjects. None 
of the other F ratios reached significance at 
the .05 level. On the basis of these results, it 
might be expected that high SI subjects would 
be more prejudiced than low SI subjects. 
Displacement is not only a function of the 
inhibition of direct aggression but is also a
function of personality. For example, the
repressors in the present experiment show no 
more displacement than the adjusted group. 

Aggression-Anxiety. Though the F ratio
(3.37) for the interaction of K X SI (see 
Table 3) is not significant at the .05 level 
(3.98), a one-tailed test of significance is 
justified since it was predicted that the low 
defensive, high SI group (anxious) would 
show most and the low defensive, low SI 
group (adjusted) would show the least ag- 
gression-anxiety. The hypothesis is confirmed 
if delay is disregarded. The adjusted group 
expressed significantly lower aggression-
anxiety (2.4) than any of the other three 
groups. The anxious group expressed the 
greatest aggression-anxiety in the ND con- 
dition (3.7) but in the D condition, the mean 
was 2.8. That delay is a significant factor in 
the expression of aggression-anxiety is shown 
by the F ratio of 3.37 and the highly significant 
F ratio for the interaction of D X SI (7.30). 
It is interesting to note, on the other hand, 
that the repressors were the only group that 
increased in aggression-anxiety after delay 
(1.3). It may be that in time these subjects 
became somewhat aware of their hostile 
impulses. 

Delay. The hypothesis concerning the effect 
of delay on feelings of anger did not come out 
as predicted with the direct technique, but 
with the projective instructions significant 
differences were obtained. The defensive 
subjects (HK) inhibit the direct expression of 
hostile feelings which tend to hide any de- 
crease over a period of time. The projective 
test gives a more accurate estimate of the 
feelings of anger since inhibitory mechanisms 
are diminished. 

As far as hostile attitudes are concerned, 
delay did seem to produce a decrease in 
Examiner Blame contrary to the prediction 
but, as the triple interaction shows, the 
decrease seems to occur primarily for the well- 
adjusted subjects (6.2) and for the high 
defensive, high SI subjects (3.8). Hostile 
attitudes do tend to persist for certain types 
of personalities, namely, the repressors and 
anxious subjects. 
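The delay effect on Examiner Blame can be checked directly from the Table 1 means by subtracting each group's delay (D) mean from its nondelay (ND) mean. The sketch below does this; the dictionary layout and function name are illustrative assumptions, while the means are those printed in Table 1.

```python
# Examiner Blame means from Table 1 as (ND, D) pairs for each group.
EXAMINER_BLAME = {
    "LSI-LK (adjusted)": (25.8, 19.6),
    "HSI-LK (anxious)": (21.9, 21.5),
    "LSI-HK (repressive)": (19.8, 20.3),
    "HSI-HK (distorters)": (22.9, 19.1),
}


def delay_drop(nd_mean: float, d_mean: float) -> float:
    """Decrease from the nondelay to the delay condition, to one decimal."""
    return round(nd_mean - d_mean, 1)


for group, (nd, d) in EXAMINER_BLAME.items():
    print(f"{group}: drop of {delay_drop(nd, d)}")
```

A positive value is a decrease after delay; the negative value for the repressive group corresponds to the slight increase noted in the text.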

SUMMARY 

On the basis of Self-Concept theory, four
types of personalities were described based on
the nature of the defenses employed in maintaining
the self-structure. Hypotheses concerning
the management of hostility were
derived from the interaction of SI discrepancy
and defensiveness: Subjects with low defensive,
low SI discrepancy (adjustive) should express
the strongest feelings of anger and hostile
attitudes, while high defensive, low SI subjects
(repressive) should show the least feelings of
anger and hostile attitudes; high SI subjects
should displace more hostility to other objects
than low SI subjects; the low defensive,
low SI group (adjustive) should show the
least aggression-anxiety while the low defensive,
high SI group (anxious) should express
the most aggression-anxiety.

It was also predicted that delay following 
the arousal of hostility would produce a 
decrease in feelings of anger but that negative 
attitudes would tend to persist. 

Eighty undergraduate males from an
introductory course in psychology were
selected on the basis of high and low scores
on an SI inventory and the K scale (defensiveness)
of the MMPI. These four groups of 20
subjects were randomly divided into an ND
and a D group (2 X 2 X 2 design). In the
ND condition, hostility measures of direct
anger, projective anger, examiner blame,
examiner devaluation, displaced hostility,
and aggression-anxiety were administered
immediately after a frustrating intelligence
test. In the D condition, the same measures
followed 20 minutes of an interpolated neutral
task. The results showed that:

1. On hostile feelings, the interaction of 
SI discrepancy and defensiveness was sig- 
nificant at the .05 level. The highest feelings 
of anger were expressed by the adjusted 
subjects and the least expressed by the re- 
pressors. 

2. On the expression of rational aggression 
(Examiner Blame) towards the examiner, the 
triple interaction of D X SI X K was sig-
nificant at about the .05 level. The adjusted 
group had the highest score immediately after 
frustration but then dropped more than any 
group after 20 minutes delay. The repressors 
showed the smallest amount of rational 
aggression in the ND condition and actually 
increased slightly after delay. 

3. On irrational aggression (Examiner Devaluation),
there were no significant main or
interaction effects.

4. As predicted, the high SI group displaced
more hostility than the low SI group.

5. The anxious group showed most, while
the adjusted group showed the least, aggression-anxiety
in the ND condition. The interaction
of D X SI was significant, indicating
that delay is an important factor in the expression
of aggression-anxiety particularly
with low SI subjects.

6. With delay, there was a significant
decrease in feelings of anger on the projective
test but no significant change occurred on
the direct technique. With delay, there was a
decrease in rational aggression primarily for
the low defensive, low SI (adjusted) and the
high defensive, high SI (distorters) subjects.
For the other two groups, repressors and
anxious subjects, examiner blame persisted
with little change.

AND PHILIP WORCHEL
PASSIVE PRIOR REFUTATION OF THE SAME AND
ALTERNATIVE COUNTERARGUMENTS¹

WILLIAM J. McGUIRE
Department of Social Psychology, Columbia University

THE present study is one of a series
comparing the relative effectiveness of
various pretreatments in making
initially unquestioned beliefs resistant to
change when subsequently the person is
forced to expose himself to strong counterarguments
against the belief. The theoretical
point of departure of the present and the
earlier studies is the postulate of “selective
exposure.” Initially strong appearing beliefs,
it is assumed, are actually quite vulnerable
under forced exposure to strong counterarguments,
because such beliefs tend previously
to have been overprotected. By selective
avoidance of counterarguments in the past,
the person has kept his beliefs extreme, but
has also left himself unpracticed in their
defense and unable to deal with strong counterarguments
when exposure to them is forced.

The postulate of “selective exposure” is no
novelty in communication research. Since
Klapper (1949) called it “the most basic
process thus far established by research on
the effects of mass media,” it has stimulated
much research, designed either to account for
it in a theoretical system (Festinger, 1957) or
to use it predictively (Janis, 1957). Its general
relevance to the present problem is to suggest
that beliefs can be “inoculated” against
persuasion in subsequent situations involving
forced exposure to strong counterarguments
by pre-exposing the person to the counterarguments
in a weakened form that
stimulates—without overcoming—his defenses
(Lumsdaine & Janis, 1953).

The hypotheses tested in the study reported
here were suggested by interpretations of the
outcomes of two previous studies. One of them
(Papageorgis & McGuire, 1961) demonstrated
that prior exposure to refuted counterarguments
tends to make a belief more resistant
to subsequently presented strong forms, not
only of the very counterarguments refuted,
but also of novel counterarguments against
the belief. Such generalized immunization
could derive from either of two mechanisms.
Pre-exposure might shock the person into
realizing that the “truisms” he has always
accepted are indeed vulnerable, thus provoking
him to develop a defense of his belief, with
the result that he is more resistant to the
strong counterarguments when they come.
Alternatively the refutations involved in the
pre-exposure might make all subsequently
presented counterarguments against the belief
appear less impressive. The other relevant
study (McGuire & Papageorgis, 1961) showed
that when the subsequent attack is by the
same counterarguments as were previously
refuted, more resistance is conferred when the
subject passively attends to the refutation
than when he himself actively participates in
it. That study corroborated the prediction
from “selective exposure” that the person is so
little able to defend these previously unquestioned
truisms that the prior defensive
session is not used effectively when the subject
is called on to help refute the counterarguments.
Hence, the particular counterarguments
presented are more thoroughly refuted and
subsequently better resisted in the passive
defense situation than in the active.

The three main hypotheses tested in the
present study all deal with interaction effects
between type of prior refutation defense and
whether the subsequent attack involves
strong forms of the same counterarguments
as were refuted or of novel counterarguments.
The first hypothesis would limit the previous
finding (McGuire & Papageorgis, 1961)—
that passive exposure to the refutations is
more effective than active in producing
resistance to persuasion—to situations in
which the subsequently presented strong
counterarguments are the very ones previously
refuted. When, on the contrary, the subsequent
exposure is to novel counterarguments against
the belief, the present hypothesis holds that
active defense is relatively more effective
(and may even be absolutely more effective)
in producing resistance. Of the two hypothesized
mechanisms for the immunizing
efficacy of the refutation defense—shocking
the subject into developing support for his
belief by acquainting him with its vulnerability
and making subsequent counterarguments
seem less impressive by refuting
the earlier ones—the first is better provided
by the active defense (wherein his inadequacy
makes him feel more vulnerable) and the
latter, by the passive defense (in which the
counterarguments get more thoroughly refuted).
Since the first mechanism is likely to
be more important in providing resistance to
strong forms of novel counterarguments and
the second mechanism, to strong forms of the
same counterarguments, the interaction prediction
follows.

¹ This study was supported, in part, by a grant from
the Office of Social Science, National Science Foundation.
A partial version of this paper was presented at
the 1960 APA meetings in Chicago (McGuire, 1960).

RESISTANCE TO PERSUASION

A second hypothesis deals with combina- 
tions of defenses: lengthening the prior ex- 
posure session to allow both active and passive 
defense should produce greater resistance 
increment over the single prior defense when 
the subsequent strong exposure is to the same 
counterarguments as had been refuted. Less 
difference in resistance between the single 
and double defense is predicted when the 
subsequent strong exposure is to novel counterarguments.
The basis of this hypothesis is
similar to that of the first one. The resistance
to the specifically refuted counterarguments 
increases with the thoroughness of the prior 
refutation; the resistance to novel counter- 
arguments depends more on the shock value 
of the pre-exposure, which is more likely to 
lessen than increase with the thoroughness of 
prior refutation of the counterarguments. 

The third hypothesis deals with an order 
effect within the double defense condition: 
the passive-active sequence of prior refutation 
should be more effective than active-passive 
if the subsequent attack involves strong 
forms of the specifically refuted counter- 
arguments; while the active-passive order 
should be the more effective against novel 
counterarguments. Similar reasoning again
underlies this hypothesis. The active defense
is likely to be more effectively carried out if
it is preceded by the passive since in this case
the subject does not have to depend on his
own meager repertory of defenses when he is
called upon to refute actively the counter-
arguments, but has at his disposal the ref- 
utations to which he has just passively 
attended. Hence, the counterargument to 
which he is pre-exposed should be more 
effectively refuted with the passive-active 
order, and he should show more resistance to 
these specific counterarguments. 

The active-passive order, on the other 
hand, is predicted to be a more effective 
preparation for subsequent exposure to strong 
forms of novel counterarguments. When the 
active defense comes first, the subject has 
already had the dismaying experience of 
being confronted with novel counterarguments 
which he is hard pressed to refute, but he 
has then seen in the passive defense how 
effectively these formidable-seeming counter- 
arguments can be refuted by an expert. After 
this reassuring experience, the subject is 
likely to be less dismayed by the further 
novel counterarguments used in the attack. 
He should tend to recall that the earlier 
counterarguments also appeared formidable 
when he was first called upon to refute them 
actively, and yet were shown to be quite 
refutable in the subsequent passive defense. 


METHOD 
General Procedure 


All subjects took part in two one-hour experimental
sessions. The first was devoted to the active and passive 
belief defenses; the second, to presenting the strong 
counterarguments against the belief. The persuasive 
aspects of the experiment were disguised by repre- 
senting the study as an attempt to obtain norms for 
analytic thinking ability in high level personnel, a 
representation given credence by some of the tasks the 
subjects were called upon to perform. 

First session. All the defenses to which this session 
was devoted consisted of a mention of two counter- 
arguments against one of the four experimental beliefs, 
together with a refutation of these counterarguments. 
The refutations took one of two forms: either the 
subject was asked to read a mimeographed message 
(about 600 words long) that mentioned two counterarguments
and refuted them in detail (passive defense);
or the subject was given a sheet listing the two counter- 
arguments and told (without any specific guidance) to 
show how each of these counterarguments could be 
refuted (active defense). The subject received passive 
only, or active only, or active-then-passive or passive- 
then-active defenses on one or another of the four 
experimental issues. In the passive defense condition, 
the subject was given 5 minutes to read the message 
and to select and underline the key clause in each 
paragraph (this task being introduced to bolster the
“analytic ability” pretext for the experiment and to
insure attention to the material). In the active defense 
condition the subject was given 10 minutes to refute 
the two counterarguments and was told that what he 
wrote need not reflect his own opinion and could be 
based on any relevant material he found in earlier 
parts of the test. 

Second session. Two days later, the subject took 
part in the second session which was devoted to 
messages containing strong counterarguments against 
his belief. These messages were similar in appearance 
to the refutational messages used in the passive defense
conditions, each being about 600 words long and 
mentioning two counterarguments against the belief. 
However, while the defensive messages had been 
devoted mainly to refuting the counterarguments 
mentioned, these second session messages were devoted 
primarily to showing how the counterarguments were 
valid. Here again the subject was given 5 minutes to 
read and underline the crucial phrases of the message 
on each issue. In half the cases, the two counter- 
arguments in these strong-attack messages were the 
same two as the subject had previously seen refuted; 
in the other half of the cases, the two counterarguments 
were novel. 

After completing this “analytic reading and under- 
lining” test, the subject was given the opinion question- 
naire and asked to indicate his personal beliefs on the 
issues regardless of what materials may have been 
presented to him in the test. This questionnaire was 
justified to the subjects as needed to determine if the 
testee’s attitude towards the content of the passages 
he had read affected the verbal skills in which we were 
supposedly interested.

Afterwards the subject filled out a questionnaire 
designed to test the effectiveness of the manipulations 
and experimental conditions and was given a detailed 
explanation of the nature of the experiments and the 
deceits used. Particular effort was made to assure the 
subject that all material presented in the messages was 
designed solely to persuade him in one direction or the 
other and that no special credence should be given to 
any of the arguments simply because it appeared in 
these messages. 


Materials 


Issues. Four health issues were selected on the basis 
of earlier studies which indicated that beliefs on items 
were homogeneously extreme in the college population. 
Three of the issues were used in a previous study 
(McGuire & Papageorgis, 1961): “Everyone should get 
a chest X ray each year to detect any possible TB 
symptoms at an early stage”; “The effects of penicillin
have been, almost without exception, of great benefit 
to mankind”; “Everyone should brush his teeth after 
every meal if at all possible.” A fourth, used for the 
first time in the present study, was: “Everyone should 
see his doctor at least once a year for a routine medical 
check-up.” 

William J. McGuire

Opinion questionnaires. Opinions on these issues were
measured by having the subject indicate on a 15-point
graphic scale the extent of his agreement with statements
bearing on one of the issues. There were 17
statements in all, 4 on each issue, the seventeenth
being a repetition of an earlier item (to serve as a
reliability check). The opinion scores to be reported are
the mean number of points the subject obtained on the
four items dealing with the issue in question.

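The scoring rule just described can be sketched as follows. This is only an illustration under stated assumptions: the issue labels, the function name `opinion_scores`, the 1-to-15 numeric coding of the graphic scale, and the exclusion of the repeated seventeenth item from the issue means are all hypothetical choices, not details given in the report.

```python
from statistics import mean

# Hypothetical short labels for the four health issues.
ISSUES = ("chest_xray", "penicillin", "toothbrushing", "checkup")


def opinion_scores(responses, item_issue):
    """Return {issue: mean rating over that issue's four scored items}.

    responses  -- 17 ratings, assumed here to be coded 1..15.
    item_issue -- 17 issue labels; the repeated reliability item is
                  marked None and is assumed to be left out of the means.
    """
    if len(responses) != 17 or len(item_issue) != 17:
        raise ValueError("expected 17 statements")
    by_issue = {issue: [] for issue in ISSUES}
    for rating, issue in zip(responses, item_issue):
        if not 1 <= rating <= 15:
            raise ValueError("rating outside the assumed 1..15 coding")
        if issue is not None:
            by_issue[issue].append(rating)
    for issue, ratings in by_issue.items():
        if len(ratings) != 4:
            raise ValueError(f"issue {issue!r} needs exactly 4 scored items")
    return {issue: mean(ratings) for issue, ratings in by_issue.items()}


# Made-up example: items rotate through the four issues, then one repeat.
example_items = [ISSUES[i % 4] for i in range(16)] + [None]
example_ratings = [15, 12, 9, 6] * 4 + [15]
example_scores = opinion_scores(example_ratings, example_items)
```

With these made-up ratings each issue receives a single score between 1 and 15, which is the unit in which the belief means are reported.
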
Defensive messages.² Each of the 600-word passages
used in the passive defense consisted of three paragraphs.
The first stated the belief and mentioned two
counterarguments against it, together with the remark
that each was refutable. The second and third paragraphs
were devoted to a detailed factual refutation of
each counterargument. There were two such messages
on each issue (hence, eight in all) in order to counterbalance
material and implement the design of having
half the subjects receive, in the second session, strong
forms of the previously refuted counterarguments, and
the other half, novel counterarguments. These messages
were given appropriate titles (“Some Misguided
Attacks on Penicillin,” etc.) but not specific source
attribution. They were mimeographed, single spaced
on a letter-sized sheet.

In the active defense condition the refutational 
material consisted of a sheet of paper with the same 
title and a brief statement of two counterarguments 
against the belief, together with instructions to write 
refutations of these counterarguments (not necessarily 
expressing one’s own view and using any material one 
wished from previous parts of the test). Here again 
there were two different forms for each issue, each 
presenting an alternative pair of counterarguments to 
be refuted. 

In the double defense conditions, the subject always 
received the same pair of counterarguments for refutation
in the active as in the passive defense.

Strong counterargument messages. These messages, 
attacking the beliefs, were similar in length and format 
to the essays in the passive defense condition but 
differed radically in content. The first paragraph 
suggested that informed opinion has recently begun to 
question the “truism” and mentioned two counterarguments
against it. The next two paragraphs attacked
the belief by describing detailed factual evidence
bolstering the counterarguments. Each message was
suitably titled (“Some Drawbacks Involved in the Use 
of Penicillin,” etc.). There were two messages attacking 
the belief on each issue, the two developing alternate 
pairs of counterarguments. 


Design and Subjects 


A total of 168 subjects took part in both sessions. In 
order to obtain data for all the comparisons involved in 
the hypotheses and also comparable data on some 
needed controls, a complex design was employed. There 
were four types of defense (active, passive, active-passive,
and passive-active), which could then be
followed by no attack, attack with the same counterarguments,
or attack with novel counterarguments,
making for 12 conditions in all. In addition, there were
two control conditions: an attack only condition, without
prior defense, to ascertain the effect of the strong
counterarguments when the beliefs had not been
immunized; and a condition involving neither attack
nor defense, to ascertain the “initial” belief levels.
Each subject served in 4 of these 14 conditions, a
different issue being used in each of his 4 conditions.
The combinations of conditions given to any one
subject were limited only by the requirement that all
subjects be given tasks taking the same amount of
time (one hour) in each session. The number of subjects
serving in each condition is shown in Table 1.

² All 24 of the defensive and attacking messages
used in this study have been deposited with the American
Documentation Institute. Order Document No.
6866 from ADI Auxiliary Publications Project,
Photoduplication Service, Library of Congress, Washington
25, D. C., remitting in advance $2.00 for microfilm
or $3.75 for photocopies. Make checks payable to
Chief, Photoduplication Service, Library of Congress.

RESISTANCE TO PERSUASION

Since the design called for four issues and two sets 
of material on each issue, counterbalancing the material 
required eight subgroups in the 12 attack-and-defense 
conditions, and in the thirteenth (attack only) condi- 
tion; while in the fourteenth condition (neither-attack- 
nor-defense) four different subgroups were required, 
one on each issue. 
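The layout of the design can be checked with a short sketch. The condition labels and the names `build_conditions` and `CELL_N` are hypothetical; the cell sizes are the parenthesized counts in Table 1. The check rests on the statement that each of the 168 subjects served in 4 conditions, so the cell sizes should sum to 168 x 4.

```python
from itertools import product

DEFENSES = ("active", "passive", "active-passive", "passive-active")
ATTACKS = ("none", "same", "novel")


def build_conditions():
    """List the 14 conditions as (defense, attack) pairs."""
    conditions = list(product(DEFENSES, ATTACKS))  # 4 x 3 = 12 cells
    conditions.append((None, "attack"))  # control: attack without prior defense
    conditions.append((None, "none"))    # control: neither attack nor defense
    return conditions


# Number of cases per cell, as read from Table 1.
CELL_N = {
    ("active", "none"): 48, ("passive", "none"): 48,
    ("active-passive", "none"): 24, ("passive-active", "none"): 24,
    ("active", "same"): 48, ("passive", "same"): 48,
    ("active-passive", "same"): 48, ("passive-active", "same"): 48,
    ("active", "novel"): 48, ("passive", "novel"): 48,
    ("active-passive", "novel"): 48, ("passive-active", "novel"): 48,
    (None, "attack"): 48,
    (None, "none"): 96,
}

N_SUBJECTS = 168
CONDITIONS_PER_SUBJECT = 4
total_cases = sum(CELL_N.values())
```

Here `total_cases` comes to 672, which equals 168 subjects times 4 conditions each, so the tabled cell sizes are consistent with the stated design.
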

The significance levels reported in the Results 
section are based on F tests in which the “error” 
variance is the residual variance in the conditions being 
compared with the treatment and (where appropriate) 
individual differences variance removed. 

The subjects were 168 students enrolled in the 
introductory psychology courses at the University of 
Illinois. Almost all were either freshmen or sophomores 
and about half were males and half, females. 


RESULTS 
General Effects 


As can be seen in the top row of means in 
Table 1, the four defensive conditions did not 
differ appreciably among themselves in regard 
to their direct strengthening effects on the 
beliefs, prior to exposure to counterarguments. 
Even the largest between-means difference,
0.80 points (between the active-passive and 
the passive-active conditions), is significant 
only at the .25 level. 

Furthermore, the mean belief level after 
these defense only treatments does not differ 
sizably from those in the neither-defense-nor- 
attack control conditions. The overall mean 
of 12.99 in the four defense only conditions is
only slightly superior (F < 1) to the 12.78 
mean in the control condition. This lack of 
direct strengthening effect from the prior 
defenses is particularly interesting in view of 
the hidden reserves of resistance against 
subsequent strong counterarguments which 
(as will be discussed below) were conferred on
the beliefs.

TABLE 1

BELIEF STRENGTH (ON A 15-POINT SCALE) IN THE VARIOUS
CONDITIONS BEFORE AND AFTER ATTACKS BY
THE SAME OR DIFFERENT COUNTERARGUMENTS
FROM THOSE TO WHICH THE SUBJECT HAD
BEEN PRE-EXPOSED DURING THE DEFENSE

(Number in parentheses is the number of
cases for the cell)

Type of defense condition prior to attack (first four columns): active; passive; active then passive; passive then active.

Time when beliefs were measured | Active | Passive | Active then passive | Passive then active | Neither attack nor defense | Attack without prior defense
After defense prior to attack | 12.94 (48) | 12.75 (48) | 12.57 (24) | 13.37 (24) | |
After defense, plus attack by same counterarguments | 10.66 (48) | 11.47 (48) | 12.15 (48) | 12.18 (48) | 12.78 (96) | 8.60 (48)
After defense, plus attack by novel counterarguments | 11.42 (48) | 10.62 (48) | 10.92 (48) | 10.72 (48) | |





The beliefs were considerably weakened 
(to a mean of 8.60) by the strong counter- 
arguments when these had not been preceded 
by any defense. This drop of 4.18 points from 
the control level is significant at the .001 
level. The impact of these strong counter- 
arguments was considerably lessened when 
they were preceded by a defensive treatment. 
The overall mean belief in the four defensive 
conditions after the defenses and strong 
attacks was 11.27. This does represent a 1.51 
point drop from the control level (p < .001), 
but it leaves the beliefs well above the 8.60 
point level to which they were reduced when 
the strong attack had not been preceded by a 
defense. 
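The arithmetic behind these comparisons is simple enough to sketch. The three means are those quoted in the text; the function names are hypothetical, and the "share prevented" index is an illustrative summary added here, not a statistic reported in the study.

```python
CONTROL = 12.78              # neither attack nor defense
ATTACK_ONLY = 8.60           # attack without prior defense
DEFENSE_PLUS_ATTACK = 11.27  # overall mean, defenses followed by attack


def drop_from_control(mean_after):
    """Points lost relative to the control level, rounded to 2 places."""
    return round(CONTROL - mean_after, 2)


def share_of_drop_prevented(mean_after):
    """Fraction of the attack-only drop that did not occur.

    Illustrative index only; it is not computed in the original report.
    """
    return (mean_after - ATTACK_ONLY) / (CONTROL - ATTACK_ONLY)


undefended_drop = drop_from_control(ATTACK_ONLY)
defended_drop = drop_from_control(DEFENSE_PLUS_ATTACK)
prevented = share_of_drop_prevented(DEFENSE_PLUS_ATTACK)
```

The two drops reproduce the 4.18 and 1.51 point figures given above; on this rough index, a prior defense averted somewhat under two thirds of the loss produced by an undefended attack.
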


Comparative Immunizing Effectiveness of the 
Defenses 


Single defense conditions. The first hypothesis 
dealt with an interaction effect in the single 
defense condition: that while passive pre- 
exposure to refuted counterarguments tends 
to be superior to active in conferring immunity 
to subsequent strong forms of the same 
counterarguments, the active tends to be 
superior (or at least less inferior) to the 
passive in conferring resistance against novel 
counterarguments. As can be seen by comparing
the four cells in the lower-left quadrant
of Table 1, this prediction was substantially
confirmed. The interaction effect is in the
hypothesized direction (passive defense
superior against the same counterarguments,
and active, against novel) and significant at
the .02 level.

The nature of this interaction effect can be 
seen more analytically if we examine the 
comparative effectiveness of active and 
passive defense against the same and novel 
counterarguments, separately. With respect 
to immunization against strong forms of the 
same counterarguments, we find the final 
belief levels in the passive and active con- 
ditions to be 11.47 and 10.66, respectively. 
The difference between these subgroups is 
significant only at the .08 level, but cor- 
roborates the more significant result of the 
McGuire and Papageorgis (1961) study. 
Against novel counterarguments, we find the 
final belief levels in the passive and active 
defense condition to be 10.62 and 11.42, 
respectively. Since this difference is significant 
only at the .13 level, it remains to be shown 
definitively that the active defense is actually 
superior to the passive with respect to con- 
ferred resistance against novel counter- 
arguments. The significant interaction does, 
however, demonstrate that any superiority of 
passive over active defense is less pronounced 
against novel than against the same counter- 
arguments. 
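The crossover pattern in the four single-defense cells can be made explicit with a small sketch. The means are those quoted in the preceding paragraph; the names are hypothetical, and the contrast below is only a descriptive difference of differences, not the F test used in the study.

```python
# Post-attack belief means in the single defense conditions.
SINGLE_DEFENSE = {
    ("passive", "same"): 11.47,
    ("active", "same"): 10.66,
    ("passive", "novel"): 10.62,
    ("active", "novel"): 11.42,
}


def passive_advantage(attack):
    """Passive minus active mean for a given type of attack."""
    return SINGLE_DEFENSE[("passive", attack)] - SINGLE_DEFENSE[("active", attack)]


def single_defense_interaction():
    """Difference of differences: passive advantage, same minus novel."""
    return passive_advantage("same") - passive_advantage("novel")
```

The passive defense leads by about 0.81 points against the same counterarguments and trails by about 0.80 against novel ones, giving a contrast of roughly 1.61 points. The sign change is what the interaction test evaluates, even though neither simple difference is significant on its own.
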

In this study, as in that by Papageorgis and 
McGuire (1961), we find that the pre-exposure 
(either actively or passively) to refuted 
counterarguments produces considerable 
generalized immunity. In neither active nor 
passive condition is there a significant differ- 
ence between the resistance to same and novel 
counterarguments, and in the two conditions 
combined the mean beliefs after the same and 
novel strong counterarguments (11.06 and 
11.02, respectively) are only trivially different.
This finding is encouraging in regard to the 
possibility of making given beliefs immune to 
persuasion even when we cannot foretell which 
counterarguments will eventually be offered 
against them. 

Single vs. double defense. The second
hypothesis dealt with an interaction between
single vs. double defense and subsequent
exposure to strong forms of the same vs.
novel counterarguments. An inspection of the
four means shows that both marginals, as
well as the predicted interaction, attained 
appreciable levels of significance. The same 
vs. novel counterarguments marginal was 
significant at the .01 level: over the four 
defensive conditions, subsequent exposure to 
strong forms of the same counterarguments 
reduced the beliefs only to 11.61 points while 
novel counterarguments reduced them to 
10.92 points. 

The single vs. double-defense marginal 
shows the least differential (significant only 
at the .10 level): after the single defense the 
strong counterarguments reduced the beliefs 
to 11.04 points and after the double defense, 
to 11.49 points. 

The interaction was in the predicted 
direction: the single defense was almost as 
effective against strong forms of novel counter- 
arguments as against strong forms of the 
same counterarguments (the postattack means 
being 11.06 and 11.02, respectively), while the 
double defense was much more effective against 
the same than against novel counterarguments 
(the postattack means being 12.17 and 10.82, 
respectively). This interaction is significant 
at the .01 level. To state this interaction effect 
in another way, the slight marginal superiority 
of the double over the single defense just 
mentioned (p = .10) was due entirely to the 
subconditions in which the subjects were 
subsequently exposed to the strong forms of 
the same counterarguments as had been
refuted in the defensive session. In this 
“same” condition the superiority of double 
over single defense was significant at the .001 
level. The implication is clear that it is worth 
belaboring the counterarguments to the extent 
of a double (active and passive) refutation if 
and only if the subsequent attack is going to 
involve the very counterarguments refuted. 
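The marginal and subgroup means cited in this comparison can be recomputed from the eight post-attack cells of Table 1, as sketched below. Two assumptions are made: the passive-then-active/novel cell is entered as 10.72, the value implied by the double defense and novel attack means quoted in the text, and unweighted averages are used, which is appropriate only because each of these cells has 48 cases. The names are hypothetical.

```python
# Post-attack cell means, keyed by (defense, attack).
CELLS = {
    ("active", "same"): 10.66, ("passive", "same"): 11.47,
    ("active-passive", "same"): 12.15, ("passive-active", "same"): 12.18,
    ("active", "novel"): 11.42, ("passive", "novel"): 10.62,
    ("active-passive", "novel"): 10.92,
    ("passive-active", "novel"): 10.72,  # assumed: implied by quoted means
}
SINGLE = ("active", "passive")
DOUBLE = ("active-passive", "passive-active")
ALL_DEFENSES = SINGLE + DOUBLE


def cell_mean(defenses, attacks):
    """Unweighted mean over the chosen cells (equal n per cell)."""
    values = [CELLS[(d, a)] for d in defenses for a in attacks]
    return sum(values) / len(values)


def single_vs_double_interaction():
    """Same-minus-novel gap for double defense minus that for single."""
    double_gap = cell_mean(DOUBLE, ("same",)) - cell_mean(DOUBLE, ("novel",))
    single_gap = cell_mean(SINGLE, ("same",)) - cell_mean(SINGLE, ("novel",))
    return double_gap - single_gap
```

Under these assumptions the recomputed values agree with the reported figures to within rounding: about 11.61 and 10.92 for same versus novel attacks, 11.04 and 11.49 for single versus double defenses, and 12.17 versus 10.82 within the double defense.
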

That the single defense is as good as—in 
fact, trivially (p = .60) better than—the 
double in conferring resistance to novel 
counterarguments fits in well with the under- 
lying postulate that the efficacy of the prior 
refutational defense in producing resistance 
to novel counterarguments derives mainly 
from provocative impact of pre-exposing the 
belief to counterarguments, which brings 
home to the subject that the truism is indeed 
attackable and stimulates him to bolster his 
belief. Elaborate double refutation of the
counterarguments adds little since the attack
involves novel counterarguments (and may 
subtract from the defense stimulating impact 
by refuting too thoroughly the counter- 
arguments to which the subject is pre-exposed 
during the defense). Thus this interaction 
effect between the single vs. double defense
and the same vs. novel counterargument 
variables lends itself to the same theoretical 
interpretation as did the significant interaction 
between the passive vs. active single defense 
and the same vs. novel counterargument
variables. 

The double defense conditions. The third 
hypothesis dealt with an interaction effect 
predicted in the double defense condition: that 
the active-then-passive sequence of defenses 
would prove more resistance producing 
against strong novel counterarguments, while 
the passive-then-active sequence of defenses 
would be more effective against strong counter- 
arguments similar to those refuted. As can be 
seen by inspecting the four relevant means in 
Table 1 (lower center portion), the interaction 
effect between the order of defenses and the
same vs. novel counterarguments variables 
is in the predicted direction but trivial in 
magnitude. Neither is there any differential 
effect in the sequence margin per se: the
strong counterarguments reduced the beliefs 
to 11.54 points after the active-then-passive 
sequence, and to 11.45 points after the passive- 
then-active sequence of defenses. (The F of 
this difference is less than 1.) The only sig- 
nificant difference in the double defense 
condition is in the same vs. novel marginal: 
after the double defenses, the beliefs were 
weakened only to 12.17 points by strong 
counterarguments that were the same as had 
been refuted, but to 10.82 points by novel 
counterarguments. This difference is sig- 
nificant on the .001 level. As mentioned in the 
previous section, this sizable same-vs.-novel 
effect in the double defense conditions contrasts
with the lack of such effect in the single
defense, which contrast tends to confirm the 
initial postulate of the study. 

While the preceding discussion demon- 
strates that the eight defensive conditions 
differed considerably among themselves in 
their immunizing effectiveness, it should be 
noted also that all conferred on the belief a 
significant resistance to the subsequent strong
attack. Even the least effectively immunizing
defense (single passive defense, with subse- 
quent attack by novel counterarguments) 
left the beliefs still appreciably stronger after 
the attack (at a mean of 10.62 points) than 
they were left in the attack only condition, 
where the resulting mean was only 8.60, the 
difference being significant at the .001 level. 

All the beliefs being defended and attacked 
in this study were of a highly homogeneous 
type: all were cultural truisms and all touched 
upon a highly involving and anxiety arousing 
topic, physical well-being. Perhaps this 
deliberate homogeneity is responsible for the 
obtained generality of effects across issues, in 
the sense that none of the interaction effects 
between issues and type of immunization 
reached conventional levels of significance. 
(However, issues as a main effect did con- 
tribute significantly (p < .05) to the variance; 
that is, the overall belief levels after both 
combined defenses and attacks did vary 
significantly among the four issues.) General- 
ization of these results to other types of issues 
would be unwarranted without further re- 
search. 

There are, in fact, theoretical reasons for 
hesitating to generalize the present findings 
to other types of beliefs. The “selective 
exposure” postulate used to derive the present
prediction would yield quite different pre- 
dictions with other types of beliefs. Had the 
beliefs been controversial rather than truisms, 
the subjects would have been more practiced 
in defending them and, hence, would have 
participated more effectively in the active- 
defense condition and would have been less in 
need of a threatening, defense stimulating 
pre-exposure. Had the issues been less in- 
volving than these health ones, there would 
have been more to gain, in regard to moti- 
vating the subject to pay adequate attention 
to the material, from requiring his active 
participation in the defense. Hence, with 
different types of issue we would expect 
resulting differences not only in the size of 
the obtained effects, but even in their direc- 
tions. 

SUMMARY

The postulate that cultural truisms tend
to be overprotected as a result of the tendency
toward “selective exposure,” together with
the results of earlier studies, led to three
predictions regarding the amount of resistance 
to subsequent strong counterarguments that 
would be conferred by pre-exposure to refuted 
counterarguments under conditions of active, 
passive, and combined refutation. The hypoth- 
eses dealt with interaction effects between 
amount of activity during the pre-exposure 
and whether the strong counterarguments 
used in the subsequent attack were the same 
as or different from those to which the subject 
had been pre-exposed. 

Each of the 168 college student subjects 
served in four conditions, each on a different 
issue. The complex design furnished data on 
the relative effectiveness of the four defensive 
conditions (active, passive, passive-active, 
and active-passive refutations of counter- 
arguments) in regard to direct strengthening 
effect, and immunization against the same, 
and against novel counterarguments presented 
in strong form 2 days later. 

Besides corroborating various results of 
previous studies, the present results confirmed 
two of the three interaction hypotheses. 

1. In the single defense condition, more 
immunity was conferred against the same 
counterarguments by passive than by active 
defense, while against novel counterarguments 
the active defense was the more effective
(p = .02).

2. In the double defense condition the
predicted interaction effect involving the
sequence of the active and passive defenses
was not found.

3. As predicted, the immunizing superiority 
of the double over the single defense was 
found only when the subsequent attack 
involved the same counterarguments; against 
novel counterarguments, the single defense 
was as effective as the double (interaction 
significant at the .01 level).

The present paper describes one part of a
larger research project concerned with
the relationships between childhood
maladjustments, personality problems, and 
associated background factors—described at 
the time of their actual occurrence rather than 
retrospectively—and adjustment in young 
adulthood. Information contained in child 
guidance clinic case histories is being com- 
pared with personnel and psychiatric informa- 
tion in the Selective Service and military 
service records of the same persons at a later 
period. For the total group of male clinic 
cases from two child guidance clinics in 
Minnesota, information has been obtained 
from the Selective Service System, the national 
Army, Navy, and Air Force Record Centers, 
and the Veterans Administration. Both 
clinics began during the 1920s; the sample
under consideration here includes men who 
were in service during World War II and 
after. 

An introductory report (Roff, 1956) in- 
cluded a preliminary predictive study, indi- 
cating that experienced clinical child psy- 
chologists could make significantly accurate 
predictions of service adjustment, good or 
poor, on the basis of the case histories. With 
this demonstration that meaningful predictions 
could be made on a global basis, it seemed 
desirable to identify the particular dimensions 
contributing to prediction. Relevant findings 
would have both theoretical and practical 
value and could provide useful leads for 
appraisal procedures outside the case history 
situation. Specified dimensions could also be 
recombined in potentially more effective 
patterns for predictive purposes. 

¹ This investigation was supported (in part) by
USPHS Grant Number M-2218, from the National
Institute of Mental Health; by Contract Number
DA-49-007-MD-2015 with the Army Medical Research
and Development Command; and by Contract AF
18 (600) 454 with the Air Force School of Aviation
Medicine.

Criterion subgroups exhibiting various
kinds of poor adjustment at the adult level
include those diagnosed as psychoneurotic,
psychotic, character and personality disorders
with and without bad conduct, “psychosomatic”
disorders, sexual deviations, etc.
One would expect that some case history 
information may be specifically predictive of 
only a single type of outcome but that some 
variables from the childhood period may be 
more broadly effective in predicting outcomes 
of more than one kind. The first of these out- 
come groups to be studied intensively was a 
sample of persons diagnosed as psychoneurotic 
while in service. Reports of predictions for this 
group in contrast to a control group judged to 
have made satisfactory service adjustments 
have been presented elsewhere (Roff, 1957, 
1960). 

The present paper is concerned with the 
application of the method developed for the 
prediction of adult psychoneurosis from child- 
hood histories to a group who exhibited severe 
bad conduct while in service. 


METHOD 


The subjects were 164 former child guidance clinic 
cases who had entered and been discharged from one 
of the military services. Half of these subjects had a 
record of severe bad conduct while in service, serious 
enough to lead to other than an honorable discharge or 
to a number of days AWOL or days of confinement 
(“bad time”) totaling at least 60. Although most of 
this behavior occurred in a military setting, a few 
subjects committed offenses in the civilian community 
severe enough to result in their discharge from service. 

The first major class of offense was repeated periods 
of AWOL, with or without other violations. A second 
major type of offense was theft or robbery under a 
wide variety of circumstances, either in a service 
situation or while on leave or AWOL. A third category, 
frequently including repeated offenses or in combination 
with other types of offense, included disobedience, 
refusal to obey an order, and violations of this nature, 
including assault on a superior. Other groups of offenses 
included repeated drunkenness, forgery, escape from 
detention, and breaches of regulations governing the 
possession or use of firearms. In general, this group of 
men committed violations for which they could have 
no reasonable hope of escaping detection and punish- 
ment. If a man went AWOL for a month, he could 
expect that this would be detected and that he would
be caught and returned (unless he had been placed in
civilian confinement in the meantime). 

Individuals who had had frequent enough or serious 
enough offenses in the civilian community to lead to 
their rejection for military service were not included in 
the present sample, so some of the most serious offenders 
were excluded, thus restricting somewhat the range of 
the group studied. Although the policies regarding 
rejection varied somewhat from one time to another, a 
substantial number of serious offenders in all periods 
were excluded from service altogether. 

Many bad conduct subjects were diagnosed by 
service psychiatrists as having a personality disorder, 
but not one appropriate for a medical discharge. 
During World War II, offenders who were sent to 
disciplinary barracks were carefully studied psy- 
chiatrically, partly with a view to determining whether 
or not they should be returned to duty. More recently, 
it has been the practice for an offender to be referred 
to a psychiatrist to assist in determining whether he 
should receive a medical discharge on psychiatric 
grounds or an administrative discharge. Since the 
reports of these interviews give a somewhat more 
definite picture of the individuals involved than does a 
simple statement of offenses, abstracts from some of 
these are presented for illustrative purposes. 

This soldier has been in the Army for approximately
eight months, and has been under psychiatric
evaluation and treatment practically ever
since he came into the Army. He is subject to periods 
of animosity and has developed psychosomatic 
symptomatology referred to various parts of the 
body, which has handicapped him in his duties. He 
has been convicted by court-martial five times during 
the past six months. 

He is a paranoid personality, chronic, severe, 
manifested by an habitual inappropriate attitude of 
being imposed upon or persecuted, nonpsychotic in 
degree. He has had six periods of AWOL ranging 
from ten to fifty-nine days. 

He is erratic, undependable, and argumentative, 
with temper tantrums. He has had repeated dis- 
ciplinary measures taken against him and has been 
constantly ostracized by the other men because of 
his laziness, belligerence, and unwillingness to 
cooperate. He is an anxious, restless, and egocentric 
individual who has always had phobias and conflicts
with others, with a long record of maladjustment
in the Navy.

These subjects have been studied without specific 
reference to combat performance because most of them 
never reached combat. Beebe and Appel (1958) ex- 
amined the relationships between various items of 
information obtained from precombat service records 
and psychiatric breakdown during combat, finding 
that men with disciplinary records broke down about 
twice as often as individuals without such history. The 
bad conduct of their groups was milder than that of 
the present subjects. 

Control subjects consisted of individuals who had 
reached and kept a grade of sergeant (or equivalent) 
or higher without any indications of disciplinary or 
mental health trouble at any point following entrance 
into service. They were matched in childhood IQ with 


the bad conduct cases because intelligence level was
sometimes directly related to service career possibilities.
Control cases were drawn by taking the nearest appro- 
priate case, in terms of clinic case number, which had 
not previously been used. While this method provided 
comparable information for both control and experi- 
mental subjects, it meant that the controls differed 
from a random sample of the general population in 
having been dealt with by a clinic during childhood.
Some subjects had received treatment ranging from 
minimal to substantial in amount, while others had 
only been studied. 
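The procedure for drawing controls (nearest appropriate clinic case number, not previously used, matched in childhood IQ) might be sketched as follows. This is an illustrative reading only: the function name, the greedy order of assignment, the tie-breaking rule, and especially the IQ tolerance of 5 points are assumptions, since the report does not state how close the IQ match had to be.

```python
def draw_controls(bad_conduct_cases, eligible_controls, iq_tolerance=5):
    """Pair each bad conduct case with the nearest unused eligible control.

    bad_conduct_cases -- list of (case_number, childhood_iq)
    eligible_controls -- list of (case_number, childhood_iq) for men who
                         met the control criterion (assumed pre-screened)
    iq_tolerance      -- hypothetical maximum IQ difference for a match
    Returns {bad_conduct_case_number: control_case_number or None}.
    """
    used = set()
    matches = {}
    for case_no, iq in bad_conduct_cases:
        candidates = [
            (abs(ctrl_no - case_no), ctrl_no)
            for ctrl_no, ctrl_iq in eligible_controls
            if ctrl_no not in used and abs(ctrl_iq - iq) <= iq_tolerance
        ]
        if candidates:
            # Nearest case number wins; ties go to the lower number.
            _, chosen = min(candidates)
            used.add(chosen)
            matches[case_no] = chosen
        else:
            matches[case_no] = None
    return matches


# Made-up case numbers and IQs, for illustration only.
example_matches = draw_controls(
    [(100, 95), (101, 110)],
    [(99, 96), (102, 94), (105, 108)],
)
```

In this made-up example case 100 is paired with control 99, and case 101 with control 105, because the nearer case 102 falls outside the assumed IQ tolerance.
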


Procedures for Analyzing Clinic Case Histories 


Information in the case histories usually covered the 
following kinds of items: behavior difficulties; mother; 
father; grandmother, aunts, etc. (of particular impor- 
tance if they functioned in place of the parents); 
siblings; home and family situations; health, including 
nervous mannerisms and speech; social adjustment 
outside the family; psychological test data; and psy- 
chiatric evaluation. 

Readers with no knowledge of the service records 
of the subjects abstracted all potentially significant 
information for certain of the categories listed above. 
These were then evaluated by other readers who had 
never seen the complete case history. For example, all 
the information about the mother was abstracted and 
evaluated by itself. A similar procedure was followed 
for siblings and sibling relations. Two practices were 
found desirable in order to lessen a possible loss of 
information in going from the full case history to these 
abstracts. 

1. Different opinions of the child were sometimes 
expressed by informants who had had an opportunity 
to observe him in different situations. In preparing 
abstracts, it was found important to identify all 
abstracted material by the character of the informant 
who furnished it (teacher, case worker, mother, psy- 
chiatrist, etc.). 

2. There was sometimes a difference between earlier 
and later descriptions of a youngster as a result of 
either treatment or merely the passage of time. It was 
considered desirable not to lose this chronological 
information in the course of the abstracting, so the 
dates were carefully recorded for all items in the 
abstract. 

While this is a highly multivariate situation, it 
seemed desirable to attempt to develop different single 
variables on an analytic basis before attempting to 
put them back together. It would be expected that 
some simplification of this multivariate situation would 
result if one or two leading dimensions could be found. 
Various lines of work, including psychiatric considera- 
tions of disturbances of interpersonal relationships in 
various behavior disorders, the success of nominating 
techniques and buddy ratings in predicting combat 
performance (Trites & Sells, 1957; Williams & Leavitt, 
1947), and the current use of group procedures as an 
aid to diagnosis in some child guidance clinics suggested 
that the boy’s social adjustment in relation to other 
youngsters his own age might be an important area to 
investigate. 
It was thus decided to see what could be accom- 
plished with the general variable of peer group social 
relationships and related items of information. Ab- 
stracts were prepared, including all appropriate material 
in this category contained in the history, with explicit 
information about informant and date for each item of 
information. One class of information, excluded as 
unhelpful if not actually misleading, covered all state- 
ments by the mother indicating good peer group 
adjustment outside the home situation. Two factors 
contributed to making this information unsatisfactory. 
The mother may have had no accurate basis for com- 
paring her child with other children on this score, and 
sometimes a mother, tiring of contacts with the clinic, 
reported that everything was fine when information 
obtained from other sources at the same time indicated 
that this was not so. These observations are in line with 
the finding of Harris (1959) that for a group of children 
without serious problems, school personnel and the 
psychiatrist agreed more closely in the evaluation of 
children than did the mother with either. 

For the peer group adjustment dimension, experience 
led to the construction of a priority list for the weighting 
of informants (other dimensions would not necessarily 
have the same list of priorities). This was related to the 
opportunities persons in different categories had for 
actual observation of the youngster in interactions with 
his associates. Instructions to the readers for the use of 
this priority list were to examine first the data from 
informants with the highest priority and to keep 
working down the list as long as, and only as long as, 
it was necessary to reach a decision. The priority list is 
as follows: 

1. Persons, primarily teachers, who had an 
extended opportunity to observe him in a peer group 
situation when quoted directly. 

2. Visiting teachers or case workers when sum- 
marizing information from persons in Category 1 
without specific quotations; clear and definite 
formal diagnostic statements by psychiatrists which 
relate to social adjustment. 

3. Family members, except for favorable state- 
ments by mother. 

4. Statements about social adjustment by 
patient in interview, and statements by psychiatrist 
except as noted above; comments about personality 
by psychologists based on impressions obtained 
during the mental test situation. 

Along with the observations of the chronology of 
statements and the priority list for informants, a 
guide for evaluating information as positive, neutral, 
or negative in appraising peer group adjustment was 
developed. This has been presented in detail elsewhere 
(Roff, 1957). Among the positive items were such 
things as all signs of liking by the general peer group, 
freedom from problems in class or on the playground, 
indications that girls liked the subject, especially at 
adolescence. Neutral items are exemplified by a shortage 
of friends without specific indications of being disliked, 
playing with younger children, etc. Items to be eval- 
uated as negative included all signs of active dislike 
by the general peer group, inability to keep friends, and 
being regarded as “odd,” “peculiar,” or “queer” by 
other children. 

Working with the abstracts according to the in- 
formant priority list and the list of information to be 
evaluated as positive, neutral, or negative, two graduate 
students in psychology at approximately the MA 
level made “blind” evaluations as to good or poor peer 
group adjustment during the period covered by the 
case history. Each student evaluated about half the 
abstracts. Judgments of “good” or “poor” were made 
only if the reader felt reasonably certain of a judgment. 
If the information seemed incomplete, a response of 
“undecided” was made. This required some judgment 
on the part of the reader, but the area in which judg- 
ment was required was narrowed markedly in com- 
parison with global predictions based on the entire 
case history. This procedure is intermediate between 
completely global predictions based on the entire case 
and more atomistic predictions based on a further 
fragmentation of the case history material. 
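
The reading procedure described above (work down the informant priority list only as far as needed, then answer "good," "poor," or "undecided") can be sketched in code. This is a hypothetical illustration: the function names, the data layout, and especially the toy decision rule are assumptions introduced here, since the readers relied on judgment rather than on a fixed numerical criterion.

```python
# Hypothetical sketch of the reading procedure; the decision rule is an
# illustrative assumption, not the raters' actual criterion.

def evaluate_peer_adjustment(items, decide):
    """Work down the informant priority list until a decision is reached.

    items  -- list of (priority, valence) pairs; priority runs from 1
              (highest) to 4, and valence is "positive", "neutral",
              or "negative"
    decide -- function mapping the valences seen so far to "good", "poor",
              or None when the evidence is still insufficient
    """
    seen = []
    for level in (1, 2, 3, 4):
        # Add the items supplied by informants at this priority level.
        seen.extend(v for p, v in items if p == level)
        verdict = decide(seen)
        if verdict is not None:
            return verdict, level      # stop as soon as a decision is possible
    return "undecided", None           # information judged incomplete


def toy_rule(valences):
    """Assumed rule: decide once one side leads by two or more items."""
    lead = valences.count("positive") - valences.count("negative")
    if lead >= 2:
        return "good"
    if lead <= -2:
        return "poor"
    return None


example = [(1, "negative"), (2, "negative"), (3, "positive")]
result = evaluate_peer_adjustment(example, toy_rule)
print(result)
```

In this invented example the family member's favorable statement (priority 3) is never consulted, because the two higher-priority informants already suffice for a decision under the assumed rule.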


RESULTS AND DISCUSSION 

This procedure for making peer group 
evaluations was originally developed with 
subjects diagnosed as psychoneurotic during 
service. Applied unchanged to a new sample of 
subjects who showed severe bad conduct while 
in service and control subjects different from 
those with which the procedure was developed, 
it yielded the results shown in Table 1. Those 
subjects whose earlier peer group adjustment 
was evaluated as poor showed significantly 
more bad conduct in service than did subjects 
whose earlier peer group adjustment was 
appraised as good. 

For purposes of comparison, results ob- 
tained earlier with the psychoneurotic sub- 
jects are shown in Table 2 (Roff, 1957, 1960). 
It can be seen that a similar discrimination was 
obtained for the psychoneurotic group. 

The problem of the early detection of 
individuals who are likely to exhibit later 
maladjustment of various kinds is generally 
recognized as an important one. It is commonly 
assumed in dealing with both physical and 
psychological difficulties that early treatment 
would have possibilities of benefit that might 
be lost if the difficulty were not detected until 
it had reached a later stage. It is difficult to 
find detailed results which support this 
assumption for the problems under considera- 
tion in this paper. It remains, however, an 
important working hypothesis. 

The combined results reported here indicate 
that the level of earlier social adjustment 
contributes significantly to the discrimination 

TABLE 1 
EVALUATIONS OF PRESERVICE STATUS IN 
RELATION TO OUTCOME IN SERVICE: 
BAD CONDUCT GROUP AND CONTROLS 

                        Appraisal 
Outcome          Good   Undecided   Poor 
Good              47       14        21 
Bad conduct       20       16        46 

χ² = 10.17; p < .01. 





TABLE 2 
EVALUATIONS OF PRESERVICE STATUS IN 
RELATION TO OUTCOME IN SERVICE: 
PSYCHONEUROTICS AND CONTROLS 

                        Appraisal 
Outcome          Good   Undecided   Poor 
Good              54       32        18 
Psychoneurotic    13       29        62 

χ² = 24.71; p < .001. 





between groups showing adjustment difficul- 
ties while in service and “good” groups. This 
is in line with an impressive amount of evidence 
that in a situation where there is sufficient 
mutual exposure to permit thorough acquaint- 
ance, appraisal by peers is very effective in 
predicting subsequent military adjustive re- 
actions of various kinds (Rigby, Sayers, 
Ossorio, & Wilkins, 1957; Trites & Sells, 
1957; Wherry & Fryer, 1949; Wilkins, 1954; 
Williams & Leavitt, 1947). The school situ- 
ation is also an appropriate place for effective 
peer group evaluations, and it has been found 
that teachers can report accurately on the 
peer group reactions of children, at least in 
cases of marked behavior disturbances. 

This can be made more concrete by giving 
illustrations of teachers’ comments for the bad 
conduct subjects. The most frequent descrip- 
tions relating to the peer group situation 
indicate in one set of words or another that 
the boy is “mean” and is disliked by his 
associates. Sometimes this is accompanied by 
attempts to dominate other children. In other 
cases it is described simply as behavior which 
is antagonizing to others: “He always wants 
to be the leader, and if the other children do 
not do as he says, he is apt to be abusive and 
ugly to them”; “His conduct is irreproachable 
in the schoolroom, but he can be very mean 
when out of sight of authority. The boys in 
his room do not like him. During the past year, 
hardly a week went by without someone com- 
plaining about him”; “The teacher had known 
of his hurting children on various occasions. 
The children disliked him and grew into the 
habit of blaming him for everything that went 
wrong”; “He had violent temper tantrums 
when crossed. He did not get along with any 
of his classmates and was always in fights 
with some of them”; and “He feels picked on, 
wants to dominate, and is often the aggressor 
in fist fights, and the other boys are afraid of 
him.” This pattern of overt aggression toward 
other children was more pronounced than it 
was in the case of the psychoneurotic group. 

In a definitely smaller number of cases, the 
boy was described as a good-natured non- 
conformist: “Very much of a problem, just 
smiled and wanted to do what he pleased”; 
“There was nothing mean about him, and he 
was good-natured and generous. He is well- 
liked by the other children, but all alike 
consider him as undependable, untruthful, 
and irresponsible.” 

Groups who know one another as well as do 
members of the same grade school class are 
fairly well aware of disturbed behavior in one 
of their members, although they may have no 
clear-cut diagnostic term for it. The peer group 
reactions employed in the present paper, 
collected and recorded by clinic personnel, 
represent a promising if neglected type of data. 
Further systematizing of such information 
should lead to an improvement in its predictive 
value. It seems very likely that a systematic 
study of antagonizing youngsters, extending 
over most of the grade school period, would 
identify, in at least the worst 1 or 2% of the 
cases, a group which would be of major interest 
from a long-time mental health point of view. 
Obtaining appraisals for at least 2 or 3 years 
would allow both for such real shifts of be- 
havior as do occur and for unreliability in the 
observations of individual teachers. It seems 
possible to use this approach to develop a 
method that may prove more practically 
effective than any now available for the early 
location of individuals with mental health 
problems. 

In attempting a solution of the early detec- 
tion problem, it is not essential that a precise 
prediction of the specific adult difficulty be 
made. The finding of the present paper that 
two major adult adjustment areas, psycho 

SOME ASPECTS OF SELF-CONCEPTIONS AND ROLE DEMANDS IN 
A THERAPEUTIC COMMUNITY 

This study is an attempt to measure 
objectively the self-conceptions of pa- 
tients and personnel, and the role de- 
mands upon them, in a psychiatric hospital 
where daily observation of hospital life over 
the past 10 years has indicated that patients 
have come to view themselves more as responsi- 
ble adults than did patients in previous years. 
The term role demand is used as suggested by 
Levinson (1959): “The role-demands are 
external to the individual whose role is being 
examined. They are the situational pressures 
that confront him as the occupant of a given 
structural position” (p. 173). From comparison 
of self-conception with role demand, an esti- 
mate of the suitability of a person to fill one 
or another position may be inferred. 

The Austen Riggs Center, an open private 
psychiatric hospital specializing in the treat- 
ment of borderline cases, where all patients are 
in intensive psychoanalytic psychotherapy, has, 
over the past 10 years, evolved into a com- 
munity in which the patients and staff share 
administrative responsibility and authority. 
Patients, individually and as a group, have 
actively collaborated with the staff to set and 
enforce standards of behavior as well as to 
create social inventions to enable the hospital 
society to run smoothly. 

Each patient is a citizen in the hospital 
community, having the privileges and responsi- 
bilities that accrue to being in the position of a 
citizen. He votes for representatives to various 
agencies, votes on policies and procedures 
regarding community life, is expected to 
exercise his rights as a citizen, and to abide by 
the agreed-upon customs, rules, and laws. 

Patients are the principal administrators of 
a work program in which all patients are 
required to participate for a minimum of one 
hour daily. The jobs are those which would 
otherwise have to be done by hospital per- 
sonnel, such as some aspects of housekeeping, 
maintenance of grounds and buildings, con- 
struction and repair of equipment and furni- 
ture, and secretarial work. The money saved 
as a result of the work accomplished is con- 
tributed to a Patient Aid Fund. Since all the 
work is carried out during a common time, 
during which each patient is aware that every 
other patient is working at some assigned job, 
the program is called the Common Work 
Program. Failure to meet a work assignment 
is considered a community problem, and 
patients who absent themselves from their 
jobs are confronted by designated agencies of 
patients and staff. 

An Activities Program of instruction in arts, 
crafts, drama, and academic courses taught by 
a staff of professional artists and teachers is 
available. Participation in activities is elective, 
and it is the design of the department to have 
people approach the activity areas as students 
rather than as patients being treated by 
traditional occupational therapy. The teachers 
tend to view and respond to their students as 
they react to others they instruct who happen 
not to be hospitalized psychiatric patients; 
the tendency of the instructors is to be more 
teachers than adjunctive therapists. The 
position of student in the Activities Program 
is, thus, like that of student in an adult educa- 
tion program. 

All patients live in a building that resembles 
a country inn. Each cares for his own room 
and determines for himself how he will spend 
his leisure time. There is no segregation by 
age, sex, or diagnosis, and any behavior that 
is socially appropriate, such as parties (with 
drinking in moderation), tournaments, dating, 
etc., is an accepted part of life in the hospital. 
Many expectations and responsibilities similar 
to those which stem from role demands of life 
outside the hospital impinge on patients 
when they fill the various positions available 
within the hospital. If a patient creates a 
problem for the community, formally es- 
tablished agencies consisting of both patient 
and staff members confront him, seeking a 
social solution to the problem while leaving 
the psychodynamics of his behavior for ex- 
ploration in his psychotherapy. 

It is probably evident from this description 
that the considered policy of the members of 
the hospital society, both patients and per- 
sonnel, has been to create a community where 
patients can have an image of themselves as 
worthwhile, responsible, and productive people 
and where the personnel can view them in a 
like manner. Since there is no master plan to 
reach the intended goal, each new aspect of the 
hospital organization must emerge from joint 
patient-staff agreement on ways of main- 
taining the more adult and mature functioning 
of patients while they are hospitalized. Pa- 
tients today seem to view themselves as more 
responsible adults than did patients in previous 
years, although there has been no change in 
admission policy.¹ 

The self-conception of the hospitalized 
psychiatric patient is a function of the role 
demands upon him as well as of his ego or- 
ganization. A psychiatric hospital ideally 
should make available to the patient a range 
of positions within which he can find not only 
some that are congruent with his ego organiza- 
tion and self-conception at the outset of 
treatment, but also others that are fitting for 
him as he changes and help him prepare for 
return to life outside. When, reintegrated, the 
patient can cope with role demands beyond 
the ones that exist in the hospital, he may 
seek social positions that are fitting with his 
new level of ego organization. 

A number of social positions were selected 
for examination and comparison in regard to 
the role demands associated with them, and in 
regard to the self-conception of patients and 
personnel. We were particularly interested in 
finding a measure of the way patients and 
hospital personnel differentiate the role de- 
mands of these positions, and in determining 
which role demands are or are not congruent 
with their self-conceptions. 

To obtain a measure of role demand, patients 
and staff were asked to rate various social 
positions on a number of adjectival scales in 
the manner described by Osgood (1952) as the 
semantic differential technique. In addition, 
ratings obtained for the word “me” could be 
compared with those obtained for social posi- 
tions to determine with which category of 


¹ For more details of the development of the hospital 
community and of the research program of which this 
paper is a part, see Kubie (1960), Polansky, Miller, 
and White (1955), and Polansky, White, and Miller 
(1957). 


positions the subject’s self-conception was most 
similar. 

The semantic differential technique makes 
it difficult for a subject to base his ratings on 
preconceived assumptions concerning the 
intended meaning of any set of ratings, and, 
thus, deliberately to control them. If, for 
example, a subject is to rate the concept 
“adolescent,” “mental patient,” or “worker 
in the Common Work Program” on a seven- 
point scale from hard to soft, or slow to fast, 
or beautiful to ugly, it is difficult for him to 
know what the “correct” rating is supposed 
to be. Especially as many concepts are rated 
along many scales while the subject is con- 
strained from comparing his various ratings, 
it seems reasonably certain that results cannot 
be deliberately manipulated. 


METHOD 


The following social positions were selected for study, 
because they have received most attention from both 
patients and staff at this hospital: 

1. One category of positions can be considered 
unique for this hospital (though, of course, analogous 
ones may exist in other hospitals). These Hospital- 
Created social positions include Worker in the Common 
Work Program, Citizen in the Riggs Community, and 
Student in the Activities Program. 

2. Another group of positions comprises some of 
those available to adults outside of a hospital, though 
they have counterparts in the hospital community 
(we refer to these as usual adult positions). They 
include Worker, Citizen, and Student. 

3. A third class (Mental Patient positions) consists 
of those that refer more directly to the psychiatric 
disorder—Patient at Riggs and Mental Patient. 

4. A fourth category—Adult, Adolescent, Child, 
and Baby—was included to allow comparisons from 
which inferences could be drawn with regard to the age 
levels that are associated with the various positions. 


Subjects 


Thirty-eight adult patients and the 33 members of 
the professional staff of the hospital (10 members of 
the clinical and research staff, 9 residents in psychiatry 
and clinical psychology, 6 activities instructors, 8 
nurses) participated in the study. The patients (20 
women and 18 men) ranged in age from 18 to 51 years 
with a median age of 25 years. Seventeen were diagnosed 
as borderline psychotic or psychotic and 21 were 
diagnosed as neurotic or neurotic character disorder. 
The median length of stay for all patients was 8.75 
months with a range from 3 days to 5 years. 


Measures of Role Demand and Self-Conception 


Rating scales based on the semantic differential 
technique were administered to all subjects. Ratings 
were made along seven-point scales of 15 adjectival 
opposites for each of the 12 social positions and for 
the word “me.” The adjectival scales in the order pre- 
sented were: cruel-kind, masculine-feminine, sick-well, 
active-passive, necessary-unnecessary, calm-excitable, 
therapeutic-untherapeutic, weak-strong, irresponsible- 
responsible, slow-fast, beautiful-ugly, real-unreal, good- 
bad, hard-soft, important-unimportant. The semantic 
differential was administered to patients in a group 
meeting called for the purpose, and to personnel at 
a regular staff meeting. 


Instructions 


All subjects were given the following instructions 
which are essentially identical to those used by Jenkins, 
Russell, and Suci (1958). 

The purpose of this experiment is to discover the 
meaning of certain words by getting your rating of 
the words on a set of descriptive scales. Attached 
are 13 sheets each with a set of scales and each with 
a different word printed at the top to be rated on 
each of the scales. I wish you to rate the words on 
the basis of what they mean to you. Place a check 
mark on each of the scales wherever you feel the 
word should be rated. Work as fast as you can; 
don’t take too long to make any rating; and rate 
according to your first impression of the words. 
Don’t hesitate to use the extreme ends of the scales, 
wherever these seem appropriate. 

Here are some examples of the way you should do 
this task: If you were rating the word EXPRESS 
TRAIN and came to the scale “slow-fast,” you would 
probably consider an express train quite fast, and 
so you would place a check mark on the “fast” end 
of the “slow-fast” scale, perhaps here: 

EXPRESS TRAIN 
slow ___ : ___ : ___ : ___ : ___ : ___ : _/_ fast 

or perhaps in the next space: 

EXPRESS TRAIN 
slow ___ : ___ : ___ : ___ : ___ : _/_ : ___ fast 



Then you would go on to the next scale. Be sure 
that your mark is between the dots. 

If next you were rating the word STREETCAR, and 
came again to the “slow-fast” scale, you might feel 
that a streetcar was only fairly fast and would 
check the scale here: 

STREETCAR 
slow ___ : ___ : ___ : ___ : _/_ : ___ : ___ fast 

Then you go on to rate STREETCAR on the rest of 
the scales. If, however, you were rating the word 
OXCART on the “slow-fast” scale, you would probably 
consider it quite slow, and rate it here: 

OXCART 
slow _/_ : ___ : ___ : ___ : ___ : ___ : ___ fast 

or here: 

slow ___ : _/_ : ___ : ___ : ___ : ___ : ___ fast 


Of course, you would make only one mark. 

Most of the ratings you are to make will not be 
so literal as these examples. For instance, rating the 
word OXCART, you might come to the scale “hot- 
cold.” 


OXCART 
hot ___ : ___ : ___ : ___ : ___ : ___ : ___ cold 

There is no obvious “correct” answer here, so 
rate it as you see it; does OXCART seem to you to be 
hot or cold or in between? Don’t expect the ratings 
to be literal. We want your impression of the words. 
In some cases you wonder how a certain scale can 
apply to the word you are rating, but we have found 
that you will be able to make the decisions quite 
easily if you follow instructions, rating quickly on the 
basis of first impressions. 


RESULTS 
The mean of the ratings of all subjects on 
each adjectival scale for each social position 


TABLE 1 
D SCORE MATRIX 

Position                                  2     3     4     5     6     7     8     9 
1. Student                               .95  1.85  2.45  1.    1.84  2.57  4.45  5.79 
2. Citizen                                    2.76  1.    1.79  1.60  2.34  4.33  5.70 
3. Worker                                           1.76  3.62  3.41  4.60  5.68  7.63 
4. Adult                                                  2.32  2.14  2.94  4.90  6.39 
5. Student in the Activities Program                             .74  1.35  3.16  4.54 
6. Worker in the Common Work Program                                  1.22  3.21  4.64 
7. Citizen in the Riggs Community                                           2.19  3.75 
8. Patient at Riggs                                                               2.08 
9. Mental Patient 

Type I: Worker, Adult, Citizen, Student. 
Type II: Student in the Activities Program, Worker in the Common Work Program, Citizen in the Riggs Community. 
Type III: Patient at Riggs, Mental Patient. 

FIG. 1. Clusters obtained for social positions available to adults. 


was calculated. From these mean scores a D 
score matrix, as described by Osgood and Suci 
(1952), was obtained for the group as a whole 
(Table 1). The D score is a measure of dissimi- 
larity between two variables. A D score of 
zero would indicate identical ratings on two 
positions; the larger the D score the less similar 
are the ratings between positions. 
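
A minimal sketch of the D score computation follows, assuming the generalized distance of Osgood and Suci, that is, the square root of the summed squared differences between two profiles of mean ratings. The function name and the two three-scale profiles are invented for illustration; the study itself used 15 scales per position.

```python
import math


def d_score(profile_a, profile_b):
    """Generalized distance D between two profiles of mean scale ratings."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be rated on the same scales")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))


# Hypothetical mean ratings of two positions on three seven-point scales.
position_a = [2.0, 3.0, 6.0]
position_b = [2.0, 6.0, 2.0]

print(d_score(position_a, position_b))  # differences 0, 3, 4 give D = 5.0
print(d_score(position_a, position_a))  # identical profiles give D = 0.0
```

Because D accumulates differences over every scale, two positions can be far apart even when they agree on most scales, provided they disagree sharply on a few.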

Elementary linkage analysis, a rapid and 
objective method of clustering variables, 
developed by McQuitty (1957), was used to 
determine the clusters in the matrix of Table 1. 
Elementary linkage analysis categorizes vari- 
ables into types in such a way that any 
variable in one category is in some way more 
like some other variable in that category than 
it is like any variable not in that category. 
With McQuitty’s technique the clusters or 
typal structures of Figure 1 were obtained for 
social positions available to adults (in this 
analysis “baby,” “child,” “adolescent,” and 
“me” were omitted). 
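
The linkage step can be illustrated with a short sketch. It treats elementary linkage as follows: link every variable to the one it resembles most (smallest D score), then take the connected groups of links as the types. This formulation, the function names, and the shortened position labels are assumptions made here for illustration. The D scores are the entries of Table 1 for seven of the nine positions; Student and Citizen are left out because two of their entries are incomplete in this copy, so the result is not the authors' full analysis.

```python
NAMES = ["Worker", "Adult", "Student-AP", "Worker-CWP", "Citizen-RC",
         "Patient at Riggs", "Mental Patient"]

# Upper-triangle D scores transcribed from Table 1 (AP = Activities Program,
# CWP = Common Work Program, RC = Riggs Community).
PAIRS = {
    ("Worker", "Adult"): 1.76, ("Worker", "Student-AP"): 3.62,
    ("Worker", "Worker-CWP"): 3.41, ("Worker", "Citizen-RC"): 4.60,
    ("Worker", "Patient at Riggs"): 5.68, ("Worker", "Mental Patient"): 7.63,
    ("Adult", "Student-AP"): 2.32, ("Adult", "Worker-CWP"): 2.14,
    ("Adult", "Citizen-RC"): 2.94, ("Adult", "Patient at Riggs"): 4.90,
    ("Adult", "Mental Patient"): 6.39,
    ("Student-AP", "Worker-CWP"): 0.74, ("Student-AP", "Citizen-RC"): 1.35,
    ("Student-AP", "Patient at Riggs"): 3.16,
    ("Student-AP", "Mental Patient"): 4.54,
    ("Worker-CWP", "Citizen-RC"): 1.22,
    ("Worker-CWP", "Patient at Riggs"): 3.21,
    ("Worker-CWP", "Mental Patient"): 4.64,
    ("Citizen-RC", "Patient at Riggs"): 2.19,
    ("Citizen-RC", "Mental Patient"): 3.75,
    ("Patient at Riggs", "Mental Patient"): 2.08,
}
DIST = {frozenset(pair): d for pair, d in PAIRS.items()}


def elementary_linkage(names, dist):
    """Return (nearest-neighbor map, clusters) for a dissimilarity matrix."""
    nearest = {
        a: min((b for b in names if b != a),
               key=lambda b: dist[frozenset((a, b))])
        for a in names
    }
    # Merge each variable with its nearest neighbor (simple union-find).
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in nearest.items():
        parent[find(a)] = find(b)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return nearest, list(groups.values())


nearest, clusters = elementary_linkage(NAMES, DIST)
for cluster in clusters:
    print(cluster)
```

On this subset the sketch yields three groups: the usual adult positions, the Hospital-Created positions, and the two Mental Patient positions. A pair whose members are each other's nearest neighbor corresponds to a double arrow in the figures.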

A linkage analysis of the D scores of three 
groups—staff, neurotic patients, and psychotic 
patients—reveals the clusters of Figure 2. 

Note that the staff rates the word Me most 
like the way they rate usual adult positions. 





² A double arrow indicates that each variable is 
most closely associated with the other. A single arrow 
indicates that the variable at the tail of the arrow is 
most closely associated with the one at the head, but 
the one at the head is not most closely associated with 
the one at the tail. 

[Figure: linkage diagrams for staff, neurotic patients, and psychotic patients, each divided into Types I, II, and III.] 

FIG. 2. Clusters obtained for staff, neurotic patients, and psychotic patients. 


Neurotic patients rate Me most like the 
Hospital-Created positions. Psychotic pa- 
tients rate Me in the cluster with Mental 
Patient. 

Patients who are ready to leave the hospital 
to return to society at large should have a 
self-conception congruent with the demands 
associated with usual adult positions. Those 
whose ego organization is so fragmented that 
they are unable to adapt to an open hospital 
environment may be expected to have a 
self-conception congruent with positions con- 
sidered rather regressed, such as that of 
mental patient. Three patients who were 
discharged within 2 months after testing and 
returned successfully to college, and three 
patients who were unable to adapt to the 
demands of the hospital (within 2 months of 
testing two were transferred to a closed 
psychiatric hospital and one committed 
suicide) were selected for separate analysis of 
their ratings. The typal analysis of the two 
D score matrices derived from the mean 
scores on all scales for the two groups—the 
Successful Group and the Unsuccessful 
Group—reveals the data in Figure 3. 

For the Successful Group the word Me is 
placed in the cluster with the usual adult 
positions, and is rated most like Student in the 
Activities Program with Student rated most 
like Me. In addition, there is a separate cluster 
made up of positions in the hospital these 
patients are about to leave. The Unsuccessful 

[Figure: linkage diagrams for the Successful Group and the Unsuccessful Group.] 

FIG. 3. Clusters obtained for the successful and the unsuccessful groups. 


Type A: necessary-unnecessary, important-unimportant, real-unreal. 
Type B: kind-cruel, therapeutic-untherapeutic, excitable-calm, beautiful-ugly, good-bad, active-passive. 
Type C: responsible-irresponsible, strong-weak, hard-soft, masculine-feminine, fast-slow, well-sick. 

FIG. 4. Linkage analysis for all subjects. 


TABLE 2 
COORDINATES OF ADJECTIVAL SCALES IN 
THE THREE TYPES 

Scales                           Coordinate values 
Type A 
  Important-Unimportant               6.89 
  Necessary-Unnecessary               6.45 
  Real-Unreal                         6.37 
Type B 
  Good-Bad 
  Active-Passive 
  Therapeutic-Untherapeutic 
  Kind-Cruel 
  Beautiful-Ugly 
  Excitable-Calm 
Type C 
  Responsible-Irresponsible 
  Strong-Weak 
  Hard-Soft 
  Masculine-Feminine 
  Fast-Slow 
  Well-Sick 
Group rates Me most as they do Mental 
Patient. 

In order to determine what dimensions were 
present in the adjectival scales, a D score 
matrix based on the data from all subjects 
was calculated which compared each adjectival 
scale with every other one. 

The elementary linkage analysis of this 
matrix revealed three types (Figure 4). 
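
The grouping step can be sketched as follows, on the assumption that the elementary linkage analysis links each variable to the variable it most resembles (here, the one with the smallest D) and treats each connected set of such links as one type. The variable names `a` to `e`, the distances, and the helper names are hypothetical; ties are broken arbitrarily in this sketch.

```python
def symmetric(pairs):
    """Build a symmetric distance lookup from {(a, b): D} pairs."""
    dist = {}
    for (a, b), d in pairs.items():
        dist.setdefault(a, {})[b] = d
        dist.setdefault(b, {})[a] = d
    return dist


def elementary_linkage(dist):
    """Group variables joined by nearest-neighbour (smallest D) links."""
    nearest = {
        v: min((o for o in dist[v] if o != v), key=lambda o: dist[v][o])
        for v in dist
    }
    parent = {v: v for v in dist}

    def find(v):
        # Follow parent pointers to the representative of v's group.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v, n in nearest.items():
        parent[find(v)] = find(n)  # merge v's group with its nearest neighbour's

    groups = {}
    for v in dist:
        groups.setdefault(find(v), []).append(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical distances among five variables.
dist = symmetric({
    ("a", "b"): 1.0, ("a", "c"): 2.0, ("b", "c"): 3.0, ("d", "e"): 1.5,
    ("a", "d"): 6.0, ("a", "e"): 7.0, ("b", "d"): 6.5, ("b", "e"): 7.5,
    ("c", "d"): 5.0, ("c", "e"): 5.5,
})
types = elementary_linkage(dist)
```

In this toy input, `a` and `b` choose each other, `c` attaches to `a`, and `d` and `e` choose each other, so two types emerge.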

In addition, D² factor analyses (Osgood,
Suci, & Tannenbaum, 1957, pp. 332-335) 
were performed on the means obtained on each 
adjectival scale in each type in order to get 


TABLE 3
MEANS OF RESPONSIBILITY DIMENSION
FOR SOCIAL POSITIONS (a)

Social position                           Staff   Patients   Patients and staff
Usual Adult positions
  (Student, Worker, Citizen)              2.81    3.03       2.92
Adult                                     2.71    3.10       2.90
Hospital-Created positions
  (Worker in the Common Work Program,
  Student in the Activities Program,
  Citizen in the Riggs Community)         3.45    3.86       3.65
Adolescent                                3.61    3.95       3.78
Child                                     4.03    3.97       4.0
Mental Patient positions
  (Patient at Riggs, Mental Patient)      6.20    4.82       4.63
Baby                                      4.98    5.25       5.11

(a) Score of 1 refers to maximum responsibility; a score of 7 to
minimum responsibility.
(b) An analysis of variance of responsibility means for Usual
Adult positions, Hospital-Created positions, and Mental Patient
positions is significant beyond the .001 level.


the coordinates. The variables in each type 
formed separate matrices and D² factor
analyses were done with each matrix. Co- 
ordinates are analogous to factor loadings: 
the higher the coordinate value of a variable 
in each typal structure, the more closely 
related is that variable with the type. As 
shown in Table 2, Type A has equally high
coordinate values for the three scales
important-unimportant, real-unreal, and
necessary-unnecessary. Type B is most highly
loaded by the scales good-bad, active-passive, 
and therapeutic-untherapeutic. The scales 
responsible-irresponsible, strong-weak, and 
hard-soft have the highest coordinate values 
in Type C, which we tentatively call a re- 
sponsibility dimension. 

It is reasonable to assume that the degree of 
responsibility that is expected of a person is in 
some measure directly related to his age level. 
Does the responsibility dimension found in 
the factor analysis show increasing responsi- 
bility associated with more mature ages? 
Table 3 indicates that the scores on the re- 
sponsibility dimension for adult, adolescent, 
child, and baby are in the predicted direction.
As Table 3 also shows, the similarity in 
ratings of usual adult positions with the rating 
of the word adult, the similarity in the ratings 
of Hospital-Created positions with those of 
the words adolescent and child, and the 
similarity between ratings of mental patient 
positions and baby are striking. The difference 
in the responses to the three categories of 
positions is statistically significant by F test 
beyond the .001 level. 


DISCUSSION 


The three usual adult positions (Worker, 
Citizen, Student) elicit very similar responses 
from both patients and personnel. The re- 
sponses of patients and staff for the Hospital- 
Created positions (Worker in the Common 
Work Program, Student in the Activities 
Program, Citizen in the Riggs Community) 
were also very similar to each other. A similar 
cluster emerges from ratings relating to the 
mental patient positions (Patient at Riggs, 
Mental Patient). From these findings we may 
infer that there are similar kinds of expecta- 
tions, characteristics, and valuations associ- 
ated with positions in each cluster. Those 
associated with usual adult positions differ less 
from those associated with Hospital-Created 
positions than they differ from those con- 
nected with being a mental patient. From the 
responses on these adjectival scales we may 
infer some measure of the role demands made 
upon persons filling the various positions 
studied. The role demands made upon Riggs 
patients (along the dimensions measured) seem
to fall somewhere between those made upon
adults outside of a hospital and those made 
upon persons considered to be mental patients. 

Are the role demands associated with the 
Hospital-Created positions congruent with the 
conception our patients have of their own 
capacity to meet these demands? Neurotic 
patients rate the word Me much as they rate 
Worker in the Common Work Program and 
Student in the Activities Program—positions 
which are near-adult. Psychotic patients place 
Me in the cluster with Mental Patient posi- 
tions, rating it most like Citizen in the Riggs 
Community. We may infer that the self-con- 
ception of neurotic patients is more nearly 
adult than is that of psychotic patients. Both 
patient groups place Citizen in the Riggs 
Community in the cluster with Mental Patient 
positions. The minimal requirements of a 
citizen in the Riggs Community, besides the 
work expectation, are little more than attend- 
ance at regular group meetings, occasionally 
taking one’s turn at preparing snacks for the 
patient group, and abiding by the accepted 
standards of behavior of the hospital. In other 
words, a fairly passive adjustment to the 
community is sufficient to be an acceptable 
citizen of the hospital. The position of citizen 
can, however, be filled in a very active, cre- 
ative, and adaptive manner—the role has a 
wide range of behaviors that would satis- 
factorily fit it. The psychotic patients, who 
respond to Me as they respond to Citizen in 
the Riggs Community, seem to view them- 
selves as capable of meeting some measure of 
social demand, but not fully capable of meet- 
ing the demands that they associate with the 
other Hospital-Created positions. 

Concerning the reciprocal relation between 
patient and hospital community, it might be 
helpful to note the process of mutual selection 
that results from the admission policy and 
from the voluntary nature of the hospital. 
Only those patients who are judged capable of 
adapting to an open hospital are admitted. 
Also, since patients enter and remain in the 
hospital voluntarily, they, in turn, select it as 
the place that fits them. The correspondence 
between the social system of a voluntary 
hospital and the degree of ego intactness of 
its patients is due, in large measure, to this 
process of mutual selection. 

There are occasions when the ego organization
of a patient proves to have become so
well integrated or so fragmented that patient 
and hospital no longer fit each other. In fact, 
when treatment is successful, integration of 
ego functions is expected to develop to a 
degree that the role opportunities within the 
hospital are too limited to dovetail satis- 
factorily with newly developed adult self- 
conceptions. Thus, the person must leave the 
hospital to be able to go to school, work, 
marry, etc., where the social structure of the 
community allows a further range of role 
opportunities to correspond with the new 
self-conception. 

From the analysis of the D score matrix of 
three patients who demonstrated a definite 
reintegration of ego functions by their readiness 
to return to college (among other indications), 
it is clear that their self-conception was most 
congruent with their conception of a student 
in or out of the hospital. Since these patients 
were tested shortly before they left the hos- 
pital, the results may reflect a point in a 
changing self-conception from being a student 
in a hospital to being a student in college. 
Such a result also supports the consideration 
that realistic congruence of one’s self-concep- 
tion with usual adult roles cannot occur while 
a person is hospitalized.

When a patient becomes manifestly psy- 
chotic or for other reasons becomes untreat- 
able, and begins to be unable to meet even 
minimal demands and expectations, the social 
demands are softened and eased temporarily, 
but never given up entirely. If, despite this 
easing, he is still unable to meet the com- 
munity’s expectations, administrative mecha- 
nisms are set in motion to transfer him out of 
the hospital. This is done partly because his 
deviance threatens the hospital social struc- 
ture, and partly because he himself cannot 
benefit from the treatment offered if he does 
not also have a minimally adequate role in the 
social setting. Three patients who were un- 
treatable in the open hospital where the study 
was conducted had self-conceptions most
congruent with that of being a mental patient. 
Their inability to adapt to the hospital com- 
munity coincided with a conception of them- 
selves (not necessarily entirely in awareness) 
as rather regressed and mentally ill. 

In this study we focused on only one significant
dimension of role demand, that
related to responsibility. A decreasing degree
of responsibility is expected from persons 
filling each group of positions from adult 
positions to Hospital-Created positions down 
to Mental Patient positions. If a person is
thought of as a mental patient, he seems to be
considered no more responsible than a baby.
If he is an active participant in the various 
hospital programs, he is considered moderately 
responsible—about as responsible as an adoles- 
cent. Only when occupying certain active 
nonpatient positions is a person considered to 
be as responsible as an adult. We can infer 
from the data also that filling some kinds of 
positions excludes a person from other posi- 
tions. For example, to be an irresponsible 
mental patient excludes one from the possi- 
bility of occupying such adult positions as 
worker or student; on the other hand, if one is 
occupying such adult positions, it is unlikely 
that he would be considered mentally ill. 
Therefore, if a person is considered to be a 
mental patient, and the role demands upon 
him are to be a mental patient exclusively (be
irresponsible, be “crazy,” etc.), he has no
opportunity to explore or learn to cope with 
more adult demands. The Hospital-Created 
positions appear to provide near-adult de- 
mands, expectations of responsibility, and 
opportunities to function as a valued member 
of the community. 

If a patient works in some of the hospital 
programs, especially the Common Work 
Program, he seems to be considered as ap- 
proaching the status of adult; that is, he is not 
merely a mental patient, but someone capable 
of assuming responsibility. Failure to partici- 
pate in the Common Work Program does not 
remove the role demand that he work as long 
as he is a member of the community. If there 
are sufficient intact ego functions correspond- 
ing with the demand, they will be supported 
by it, and the patient will be able to perform 
in that work situation. Without the social 
demand, sufficient internal forces would be 
less likely to be mobilized in the hospitalized 
patient to enable him to participate in a 
consistent and sustained way. 

The results of this study confirm the im- 
pression gained from clinical observations 
that the developments, thus far, in this
hospital community have provided role 
opportunities for patients that allow and
support behavior that is socially responsible,
competent, and valued. The kind of hospital
organization described above seems to approach
the goal of providing a subsociety
within which a temporary functional balance
may exist between a patient’s ego organization 
and the social structure of the hospital. As 
reorganizations occur in the ego capacities of 
the person, movement from the hospital 
society to society at large is possible without 
vast readjustment in the patient’s self-con- 
ception or utter unpreparedness for the social 
demands that he will encounter on leaving 
the hospital. 


SUMMARY 


A study of role demand and self-conception 
using the semantic differential technique was 
conducted in a psychiatric hospital organized 
as a therapeutic community. Both patients 
and staff clearly differentiated three cate- 
gories of social positions: usual adult social 
positions, positions unique for the hospital 
studied, and those related to being a mental 
patient. Staff members’ rating of the word Me 
was most similar to their ratings of usual adult 
positions; neurotic patients’ rating of Me was 
most similar to their ratings of Hospital- 
Created positions; rating of Me by psychotic 
patients was most similar to their ratings of a 
cluster of positions that included Mental 
Patient. From these results we infer that the 
self-conception of staff members is most 
congruent with the role demands of usual 
adult positions, the self-conception of neurotic 
patients most congruent with the role demands 
of the positions unique for the hospital, and 
that of psychotic patients most congruent with 
the role demands associated with being a mental
patient. On a responsibility dimension
there were striking similarities between the 
ratings of usual adult positions and adult, 
between the ratings of Hospital-Created 
positions and adolescent and child, and 
between the ratings of Mental Patient posi- 
tions and baby. 

The results were discussed in relation to the 
reciprocation between the degree of the pa- 
tients’ ego intactness and the social structure 
of the hospital studied.


STUDIES in communication research have
reached no consensus as to the effect of
order of presentation. The pro-con vs.
con-pro paradigm leads sometimes to primacy
effects (greater effect of the first communication)
and sometimes to recency effects. Although
some progress has been made in the
study of the relevant variables (Anderson,
1959; Hovland et al., 1957a; Luchins, 1958;
Miller & Campbell, 1959), the situation is far
from clear.

One limitation of most work is the small
number, typically two, of communications
involved. Two experiments (Anderson, 1959;
Weld & Roff, 1938) have used reasonably long
sequences of communications on a single topic.
However, no work has been done to assess
practice effects over a sequence of communications
on separate topics. Both lines of attack
are desirable, not only because they simulate
everyday situations more closely, but also for
theoretical reasons. Thus, Hovland (1957b)
has emphasized some of the ways in which
prior familiarity with the topic might influence
reception and acceptance of the communications.
Analogous considerations would
apply when a sequence of different topics is
used, and various practice effects might also
be important. The two cited paradigms would
thus be expected to be useful in bringing the
relevant psychological processes under closer
experimental scrutiny.

The present experiment was designed to
study order effects over a sequence of communications
on separate issues. The classic
paper of Asch (1946) suggested the use of
personality adjectives in order to get a large
body of relatively homogeneous material.
Asch’s results, as well as those of Luchins
(1957, 1958), indicated that under the specific
experimental conditions employed here, primacy
effects would be obtained for the initial
trials. However, it was thought that with
continued practice the primacy effect would
decrease and perhaps become a recency effect.

AND ALFRED A. BARRIOS
California, Los Angeles
METHOD 
Subjects. Volunteers who were fulfilling a class
requirement in introductory psychology were assigned
randomly to the various conditions within each experiment.
Sex was balanced in Experiment 1 but not in
Experiment 2. The Ns for the two experiments were
64 and 24, respectively. Subjects were examined
individually.

Stimuli. A set of 328 adjectives descriptive of
personality characteristics was picked on the basis of
familiarity by the experimenter and was then given a
rough scaling based on the responses of 10 judges. The
judges rated each adjective on a six-point Favorable-Unfavorable
scale which had an additional “X” rating
for unfamiliar words. The 25 adjectives that received
an X were not used. The mean ratings of the remaining
adjectives ranged from 0.1 to 4.8, with a rating of 5
corresponding to Very Favorable. The 48 adjectives in
the range 4.0-4.3 will be called H; the 48 adjectives in
the range 1.2-2.5 will be called L. Table 2 shows some
of the adjectives that were used.

Procedure. Subjects were told that they would be
read a number of sets of adjectives, each set describing
a different person, and that they should try to form an
impression of the kind of person the adjectives described.
The instructions indicated that they were to
think of the six adjectives as having been given by six
different people who knew that person well. After the
experimenter read each set, the subject indicated his
response numerically in terms of a rating scale typed
on a card in front of him. This eight-step scale ranged
from +4 to -4, the neutral response being disallowed.
An identifying label was placed by each number using
the various combinations of Favorable and Unfavorable,
with Highly, Considerably, Moderately, and
Slightly as modifiers.

Experiment 1. Each subject judged 61 or 62 sets of
six adjectives each. One of these sets was the same as
that used by Asch (1946, Experiment VI). It was used
as the first set for half the subjects, and as the last set
for the other half. Within each of these halves, the set
was given in forward order to half the subjects and in
reverse order to the other half. Those subjects who
received this set as their first set also received it as their
last set.

The remaining sets were of five types. Type HL
consisted of three adjectives in the H range followed
by three adjectives in the L range of scale values. Type
LH had three L adjectives followed by three H adjectives.
Type GD was a descending sequence of six
adjectives that went gradually from H to L in scale
value. Type GA gradually ascended from L to H. Type R
consisted of six adjectives chosen similarly to the other
types but arranged in random order within a set. The
sets of each type were constructed randomly subject to
the restriction that no adjective appear twice in any set
and that each adjective be used about equally often.

Six ordered blocks of 10 sets each were constructed
subject to the restriction that two sets of
each type appear in each block. Four sequences of 60
sets were obtained using four different permutations of
the six blocks. From these four sequences, four additional
sequences were formed by reversing the order of
the adjectives in the 60 sets. The GD, GA, and R sets
were completely reversed in order. However, the H and
L subsets in the HL and LH sets were simply interchanged
without disturbing the order of the three
adjectives within either subset.
The adjectives of each set were read by the experimenter
at an approximate rate of one adjective each 3
seconds. Three trials were given each minute until all
the sets had been judged.

Experiment 2. Each subject judged 90 sets of two
adjectives each, 30 HL, 30 LH, 15 LL, and 15 HH.
These sets were constructed as in Experiment 1 and
balanced for type in ordered blocks of 30 sets each.
Half the subjects received the sets in one sequence; the
other half received the same sequence but with the
order of the two adjectives in the LH and in the HL
sets reversed.

The main independent variable was the time between
the reading of the first and second adjectives of a set.
Each subject judged one block of 30 sets at each of
three time intervals: 0, 2, and 4 seconds. All six possible
sequences of time intervals were used in a latin square
design. Successive blocks were separated by 1 minute
during which the subject was told that a new tempo
would be used for the following block.
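
The counterbalancing of time intervals can be illustrated with a short sketch. It enumerates the six possible orders of the three intervals and assigns them to subjects in rotation. The equal allocation of four subjects per order and the cyclic (rather than random) assignment are assumptions made for illustration; the text states only that all six sequences were used. The name `interval_orders` is hypothetical.

```python
from collections import Counter
from itertools import permutations

INTERVALS = (0, 2, 4)  # seconds between the first and second adjective


def interval_orders(n_subjects):
    """Assign each subject one of the six interval orders, in rotation."""
    orders = list(permutations(INTERVALS))  # 3! = 6 possible sequences
    if n_subjects % len(orders) != 0:
        raise ValueError("n_subjects must be a multiple of 6 for equal cells")
    return [orders[i % len(orders)] for i in range(n_subjects)]


assignment = interval_orders(24)

# Count how often each interval falls in each of the three block positions.
position_counts = [Counter(order[pos] for order in assignment) for pos in range(3)]
```

With all six orders used equally often, every interval occupies every block position the same number of times, so interval and block position are not confounded.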
FIG. 1. Mean primacy scores as a function of sex, type of set, and trial blocks. (Experiment 1.) [Graph not reproduced; curves: Female GD-GA, Female HL-LH, Male GD-GA, Male HL-LH; horizontal axis: trial blocks.]








TABLE 1
ANALYSIS OF VARIANCE OF ORDER EFFECT
SCORES SUMMED OVER ALL TRIALS

Source         df    F
Mean            1    48.38*
Sex             1     2.23
Error (b)      62    (183.81)
Type            1      .02
Sex X Type      1     8.03*
Error (w)      62    (83.77)

Note. Error mean squares in parentheses.
* p < .05


RESULTS 

Experiment 1. If an HL set produces a 
higher response than the corresponding LH 
set, then the first subset of three adjectives 
had a stronger effect than the second subset, 
within at least one of the two sets. Positive 
HL-LH differences, and positive GD-GA 
differences thus represent primacy effects. 
These two difference scores were computed 
for each subject using the several adjectives of 
each type in a given block of trials. 
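
As a sketch of the difference score just described, the following assumes that a subject's score for a block is the mean response to the descending sets (HL or GD) minus the mean response to the corresponding ascending sets (LH or GA). Whether means or sums were used within a block is an assumption here, and the ratings shown are hypothetical.

```python
from statistics import mean


def primacy_score(descending_responses, ascending_responses):
    """Mean response to descending sets minus mean response to ascending sets.

    A positive value indicates primacy: the adjectives heard first
    carried more weight than those heard last.
    """
    return mean(descending_responses) - mean(ascending_responses)


# Hypothetical ratings (+4 to -4) by one subject in one block of trials,
# which contains two HL sets and two LH sets.
hl_ratings = [2, 1]
lh_ratings = [0, -1]
hl_lh_score = primacy_score(hl_ratings, lh_ratings)
```

Here the HL sets average 1.5 and the LH sets average -0.5, so the score of 2.0 would count as a primacy effect.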

The results are shown in Figure 1 which 
plots mean difference scores for the four main 
experimental conditions as a function of 
trial blocks. It is seen that there is a strong 
primacy effect which equals 0.69 averaged 
over all conditions. Although the effect decreases
somewhat over trials, it does not
appear to be approaching zero. Females show
more primacy than males but this sex difference
resides largely in the HL-LH sets. Indeed, the
sexes are not too far apart on the GD-GA
curves and, compared to these, the males are
lower and the females are higher on the
HL-LH curves.

The analysis of these data was performed
on the difference scores summed over all 60
trials and is given in Table 1. Since a difference
score was used, the F for Mean shows a
significant primacy effect. Although the main
effect of Sex is not significant, the significant
Sex X Type interaction verifies the comments
on sex differences made in the preceding
paragraph. A trend test showed that the
decline of the primacy effect with trials was
significant (F = 12.27, df = 1/60).


TABLE 2
MEAN RESPONSE TO SIX SETS SHOWING STRONGEST PRIMACY EFFECTS

Type  Words                                                                 Descending  Ascending  Mean
                                                                            order       order      primacy
GD    smart, artistic, sentimental, cool, awkward, faultfinding             1.38        -0.72      2.10
HL    determined, tolerant, gentle, stubborn, forgetful, tricky             1.62        -0.16      1.78
GD    orderly, entertaining, humble, cool, calculating, moody               1.84         0.44      1.40
HL    efficient, scholarly, smart, crafty, faultfinding, unruly             0.66        -0.66      1.32
GD    warm, independent, optimistic, reserved, tough, faultfinding          1.69         0.41      1.28
HL    patriotic, perfective, generous, softhearted, uninhibited, humorless  1.84         0.59      1.25

The absolute responses, as distinguished
from the difference scores, are also of interest.
The mean response over all descending sets
(HL and GD) was 0.94, and the mean response
over all ascending sets (LH and GA)
was 0.25. This latter value is quite close to
the mean response of 0.32 for the R sets.
Since the adjectives for the R sets were from
the same pool as for the other sets, the asymmetry
of these results suggests that the primacy
effect has its source in the initial adjectives
of the descending sets. However, this
suggestion must be viewed with caution
since, although the HL-LH and GD-GA order
effects were almost identical, the mean response
to GA and LH sets was 0.48 and 0.02,
respectively.
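
The means reported in this paragraph can be checked against one another with a few lines of arithmetic. The sketch assumes that GA and LH sets occurred equally often, which follows from the design of two sets of each type per block; the variable names are illustrative.

```python
# Mean responses reported in the text.
descending_mean = 0.94   # HL and GD sets
ascending_mean = 0.25    # LH and GA sets
random_mean = 0.32       # R sets
ga_mean, lh_mean = 0.48, 0.02

# Overall primacy effect: descending minus ascending.
overall_primacy = round(descending_mean - ascending_mean, 2)

# The ascending mean recovered from its two equally frequent components.
ascending_from_parts = round((ga_mean + lh_mean) / 2, 2)

# Distance of each ordered condition from the randomly ordered sets.
descending_shift = round(descending_mean - random_mean, 2)
ascending_shift = round(ascending_mean - random_mean, 2)
```

The difference of 0.69 agrees with the overall primacy effect reported earlier, and the descending sets lie much farther from the R sets (0.62) than the ascending sets do (-0.07), which is the asymmetry the paragraph describes.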

Results from the set used by Asch (1946) 
were consistent with the above. When this set 
was given first, the means were 0.87 and
-0.94 for the GD and GA orders, respectively.
The primacy effect of 1.81 was significant
(F = 6.87, df = 1/30). The primacy
effect was just half as large when the set was 
given last and this was nonsignificant, even 
when all 64 subjects were used. 

Of the 48 critical sets used, all but 5 gave a 
primacy effect. The descending versions of the 
6 sets which showed the greatest overall mean 
primacy effect are listed in Table 2. 
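
For readers who wish to verify Table 2, the primacy column is simply the descending-order mean minus the ascending-order mean. The sketch below copies the six rows of means from the table and recomputes the differences; the variable names are illustrative.

```python
# (type, descending-order mean, ascending-order mean, reported primacy)
table_2 = [
    ("GD", 1.38, -0.72, 2.10),
    ("HL", 1.62, -0.16, 1.78),
    ("GD", 1.84, 0.44, 1.40),
    ("HL", 0.66, -0.66, 1.32),
    ("GD", 1.69, 0.41, 1.28),
    ("HL", 1.84, 0.59, 1.25),
]

# Recompute primacy as descending minus ascending, rounded to two places.
recomputed = [round(desc - asc, 2) for _, desc, asc, _ in table_2]
reported = [primacy for _, _, _, primacy in table_2]
all_rows_agree = recomputed == reported
```

All six recomputed values match the reported column, and the rows are listed in order of decreasing primacy.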

Experiment 2. The overall mean scores for 
the HH, HL, LH, and LL sets were 2.68, 0.80, 
0.76, and —1.27, respectively. The analysis of 
variance was performed on the HL-LH 
scores computed for the three successive 
blocks of 30 trials. Neither time interval nor 
trial blocks approached significance. The 
95% confidence interval for the overall mean 
order effect was 0.04 + 0.12. It thus seems 
reasonable to conclude that the true order 
effect is rather small when only two adjectives 
are used. Supporting evidence is found by 
comparing Error(b) to Error(w) which tests 
for individual differences. The resulting
F (df = 17/36) was 1.19. Because of the great
power of the F test with this many df, it
seems reasonable to conclude that true individual
differences in the order effect are also
quite small when only two adjectives are used.


DISCUSSION 


The primacy effect that was found is striking.
Over the first block of 10 trials, the mean
difference in response obtained from simply 
reversing the order of the six adjectives was 
1.12. This is a respectable part of the eight- 
point scale, especially since few subjects used 
the full scale range. Moreover, the result is 
not due to some peculiarity of the particular 
sets used since they were constructed ran- 
domly and since a primacy effect was observed 
in 43 of the 48 sets. However, it is not clear 
how far the results may be generalized beyond 
the experimental situation used here. The 
need for some caution in this respect is indicated
by the work of Luchins (1957, 1958)
who finds that the order effect changes with 
variations in the procedure. 

The decline in primacy over the later 
trials has a number of possible causes. It may 
be that, despite the use of the random sets, 
the pattern of good and bad words in the 
remaining sets brings about an increased
tendency to take account of all the words in 
each set. This possibility could perhaps be 
tested by reducing the relative frequency of 
the critical sets. A progressive loss of interest 
in the task might also influence the primacy 
effect, a hypothesis which would be testable 
by inserting a rest and motivating instruc- 
tions partway through the session. A third 
possibility is that adaptation to the experimental
situation and practice in integrating
the material are the governing factors. If so,
the decrement in primacy found here indicates
that caution should be used in generalizing
results of studies employing only one or two
communications.

The results of an earlier experiment (Anderson,
1959) were interpreted as suggesting the
existence of two opinion components. The
basal component, once formed, was quite
resistant to change whereas the surface
component was readily influenced by the
successive communications. The present data 
would be consistent with the two-component 
hypothesis if it were assumed that a strong 
basal component was formed over the first 
three but not the last three adjectives within 
a given set. Thus for an HL set, the surface 
component induced by the three H adjectives 
would be approximately canceled by the
surface component induced by the three L
adjectives. However, the basal component
induced by the three H adjectives would 
persist to produce a response toward the 
Favorable end of the scale. For the correspond- 
ing LH set, the basal component would tend 
to produce a response toward the Unfavorable 
end of the scale, and the difference between 
the two would constitute the primacy effect. 
The two-component hypothesis is thus not 
inconsistent with the present data but other 
explanations cannot be ruled out. In particu- 
lar, the primacy effect may result from a 
progressive decrease in attention over the 
adjectives of a given set. Attention decrement 
could presumably be reduced by more strin- 
gent control or at least tested by asking sub- 
jects to recall the adjectives (McGuire, 1957). 
It should also be noted that in the present 
situation the basal component would be 
equivalent in effect to and perhaps explainable 
in terms of Luchins’ (1957) Einstellung, or 
Asch’s (1946) concept of direction. However, 
a two-component interpretation suggests a 
test which the other two formulations do not. 
For consider the order effect paradigm defined 
by the two sets, HLH’L’ vs. HLL’H’, in 
which the first six adjectives are the same in 
both sets and the last six differ only in order 
of presentation. According to hypothesis, only 
the surface component is affected over the 
last six adjectives and hence (Anderson, 1959) 
the paradigm should yield a recency effect. 
Although this prediction must be considered 
rather speculative, the use of this and similar
paradigms should prove valuable in the study
of opinion and impression formation. 
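
The way the two-component account yields primacy for HL and LH sets can be shown with a toy calculation. This is only an illustrative sketch and not a model proposed in the text: it assumes that the basal component is proportional to the mean scale value of the first three adjectives, that the surface component is proportional to the mean of all six, and that scale values are centered so that H is +1 and L is -1. The weights and the function name are hypothetical.

```python
from statistics import mean

H, L = 1.0, -1.0  # hypothetical centered scale values


def two_component_response(values, basal_weight=0.5, surface_weight=0.5):
    """Toy response: basal part from the first three adjectives only,
    surface part from all six adjectives."""
    basal = basal_weight * mean(values[:3])
    surface = surface_weight * mean(values)
    return basal + surface


hl_set = [H, H, H, L, L, L]
lh_set = [L, L, L, H, H, H]

hl_response = two_component_response(hl_set)
lh_response = two_component_response(lh_set)
toy_primacy = hl_response - lh_response
```

Under these assumptions the surface contributions of the H and L halves cancel, the basal component alone separates the two orders, and the HL set is rated more favorably than the LH set.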

Experiment 2 was designed on the assump- 
tion that a considerable crystallization of 
impression had occurred in the first half of the 
sets in Experiment 1. It was thought that the 
primacy effect would increase as increased 
time was allowed for the first adjective to sink 
in. Even though the reduction from six to two 
adjectives was expected to decrease the overall 
effect, the finding of negligible order effects in 
Experiment 2 came as a surprise in view of the 
strong primacy obtained in Experiment 1.
The results of Experiment 2 thus give no 
evident support to Asch’s (1946) hypothesis 
that the first adjective sets up a directed 
impression in terms of which the later adjec- 
tives are interpreted. However, the combined 
results of the two experiments give some 
basis for speculating that the critical events 
leading to primacy in Experiment 1 occurred 
at the second and third adjectives. 

SUMMARY 

Sets of Favorable and Unfavorable adjec- 
tives descriptive of general personality char- 
acteristics were constructed so as to test for 
effect of order of presentation of the adjectives. 
The sets were read to subjects who were asked 
how favorable an impression they had of the 
person described by the set of adjectives. 

In Experiment 1, 64 subjects each judged 
some 60 sets of six adjectives each. Strong 
primacy effects were found although there 
was some decrement over trials. Females 
showed greater primacy than males for sets in 
which the change from Favorable to Un- 
favorable (and vice versa) was abrupt. How- 
ever, there was little sex difference when the 
change was gradual. 

In Experiment 2, 24 subjects each judged 
90 sets of two adjectives each, with intervals 
of 0, 2, and 4 seconds between the adjectives. 
Time interval had no observable effect and 
order of presentation was also nonsignificant.


SOCIAL DESIRABILITY OR ACQUIESCENCE IN THE MMPI? A CASE
STUDY WITH THE SD SCALE¹


ALLEN L. EDWARDS 


University of Washington


SEVERAL years ago I reported in a monograph
(Edwards, 1957b) the results of
research bearing upon the problem of
the social desirability variable in personality
assessment. I interpreted the results as supporting
the hypothesis that scores on many
personality scales of the True-False type could,
in large part, be accounted for by individual
differences in the tendency of subjects to give
socially desirable responses to items in the
scales. Part of the evidence bearing upon this
hypothesis consisted of the correlations obtained
between Minnesota Multiphasic Personality
Inventory (MMPI) scales and scores
on another scale which I called the Social
Desirability (SD) scale.

It will be clear to anyone who reads the 
monograph that I in no way restricted the 
social desirability hypothesis to the MMPI 
scales, nor was the evidence confined only
to the correlations between the MMPI scales 
and the SD scale. It is, however, my interpre- 
tation of scores on the SD scale as measuring 
the tendency to give socially desirable re- 
sponses and of the correlations of the SD scale 
with the MMPI scales as supporting the social 
desirability hypothesis that has been ques- 
tioned and it is for this reason that my paper 
is primarily concerned with the MMPI and 
SD scales. 

It has been suggested, for example, that, 
contrary to my interpretation, the SD scale 
is not a measure of the tendency to give 
socially desirable responses, but is instead a 
measure of the tendency to acquiesce, that 
is, of the tendency to give True responses to 
the items in the scale. It has also been sug- 
gested that many of the MMPI scales reflect 
the tendency to acquiesce and that the cor- 
relations between the SD scale and the MMPI 
scales can, therefore, be accounted for by 
individual differences in the tendency of sub-
jects to give acquiescent responses to the items
in the scales.


¹ Presidential Address delivered to the Division of
Evaluation and Measurement, American Psychological
Association, September 1960.


The question before us, then, is a controver- 
sial one. I propose to review, briefly, the his- 
tory of the controversy by first stating the 
argument for the social desirability hypothesis 
and then the argument for the acquiescence 
hypothesis. I shall then present some new 
evidence which bears upon these two alterna- 
tive hypotheses. 


SOCIAL DESIRABILITY HYPOTHESIS


Relationship between Probability of Item En-
dorsement and Social Desirability Scale Value

My first research (Edwards, 1953b) on the 
subject of social desirability was concerned 
with the relationship between the probability 
of endorsement of a personality statement and 
the social desirability scale value of the state- 
ment. This study is particularly relevant to 
the points I shall develop later with respect 
to both the acquiescence and social desirability 
hypotheses and I shall, therefore, describe the 
procedures I followed as well as the results. 

As is well known, psychological scaling 
methods have been used for many years to 
obtain scale values for statements of opinion. 
For example, if we have a set of statements of 
opinion relating to, say, capital punishment, 
we can have judges rate how favorable or 
unfavorable they believe each statement to 
be with respect to capital punishment. Psy- 
chological scaling methods can then be applied 
to the distributions of ratings to obtain a scale 
value for each statement. It is then possible 
to order the statements on a continuum ranging 
from highly unfavorable, through neutral, 
to highly favorable opinions. 

It was my belief that the same methods 
could be applied to obtain social desirability 
scale values for personality statements. Con- 
sequently, I asked judges to rate the degree 
of social undesirability or social desirability 
of the behavior, trait, or characteristic rep- 
resented by each personality statement in a 
set of 140. I then used the method of succes- 
sive intervals to obtain a social desirability 
scale value for each statement. The scale 
values order the statements on a continuum 
ranging from highly socially undesirable,
through neutral, to highly socially desirable 
characteristics. I call this continuum the 
social desirability continuum. 

The statements were then printed in the 
form of a personality inventory and adminis- 
tered to a new group of subjects under stand- 
ard instructions to describe themselves. For 
this group the proportion endorsing or answer- 
ing True to each statement was found. This 
proportion I refer to as the probability of item 
endorsement. If we now plot the probability 
of item endorsement against the social desir- 
ability scale value, we can see whether there 
is any relationship between the two variables. 
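
The computation described above can be sketched as follows. This is a minimal illustration and not the original analysis: the function names, the toy responses, and the scale values are all invented, and the only assumptions are that each response is recorded as True or False and that each item has one numeric scale value.

```python
from math import sqrt


def endorsement_probabilities(responses):
    """Proportion of subjects answering True to each item.

    `responses` holds one list per subject; each inner list holds one
    boolean per item (True = the subject endorsed the item).
    """
    n_subjects = len(responses)
    n_items = len(responses[0])
    return [sum(subject[i] for subject in responses) / n_subjects
            for i in range(n_items)]


def pearson_r(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: 4 subjects answering 3 items.
toy_responses = [
    [False, True, True],
    [False, False, True],
    [False, True, True],
    [True, False, True],
]
# Invented social desirability scale values, one per item.
toy_scale_values = [1.0, 2.5, 4.0]

p = endorsement_probabilities(toy_responses)
r = pearson_r(toy_scale_values, p)
```

With real data, `p` would be plotted against the scale values, and `r` would summarize how closely the points follow a straight line.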

The relationship that I found is shown in 
Figure 1. The X or horizontal axis is the social 
desirability continuum, with the left end re- 
presenting statements with socially undesir- 
able scale values and the right end statements 
with socially desirable scale values. The Y 
or vertical axis is the probability of item en- 
dorsement. It is evident that the probability 
of item endorsement, or of a True response, 
is a linear increasing function of the social de-
sirability scale value. The more socially desir-
able a statement is, the greater is the proba-
bility of item endorsement. The product-
moment correlation between the two variables
is .87.

I first presented the results shown in Figure
1 at a meeting of the Western Psychological
Association in 1952. Since that time the re-
search has been replicated many times (Cowen
& Tongas, 1959; Edwards, 1955, 1957a, 1959;
Hanley, 1956; Hillmer, 1958; Kenny, 1956;
Taylor, 1959; Wiggins & Rumrill, 1959;
Wright, 1957). These replications have in-
volved variations in the scaling method used
to obtain the social desirability scale values, 
variations in the technique used to obtain the 
descriptions of subjects, variations in the set 
of personality items, and a range of different 
groups of subjects. In each instance the results 
have been consistent with those I originally 
reported. Thus, I believe it is possible to state 
that the relationship between probability of 
item endorsement and social desirability scale 
value is a general phenomenon rather than an
isolated result obtained under rather restricted 
conditions. 


Social Desirability Scale 


Consider now one way in which we might 
key a given set of personality statements. 
Suppose, for example, that if a statement has 
a socially desirable scale value we key the 
True response, whereas if the statement has 
a socially undesirable scale value we key the 
False response. The resulting scale would be 
of a kind that I have called a social desirability 
or SD scale. An SD scale, in other words, is 
one in which all of the items are keyed for 
socially desirable responses. A socially desirable 
response is defined as a True response to an 
item with a socially desirable scale value or a 
False response to an item with a socially 
undesirable scale value. The score on an SD 
scale is the number of socially desirable re- 
sponses that have been given to the set of 
items comprising the scale. 
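
The keying and scoring rule just defined can be sketched in a few lines. This is a hedged illustration only: the function names, the `neutral_point` parameter, and the toy values are assumptions, and the sketch presumes a numeric scale on which higher values mean more socially desirable and no item sits exactly at the neutral point.

```python
def sd_key(scale_values, neutral_point):
    """Keyed response for each item of an SD scale.

    True is keyed for items whose scale value lies above the neutral
    point (socially desirable items); False is keyed for items below it
    (socially undesirable items).
    """
    return [value > neutral_point for value in scale_values]


def sd_score(responses, key):
    """Number of socially desirable responses, i.e. responses matching the key."""
    return sum(1 for answer, keyed in zip(responses, key) if answer == keyed)


# Invented scale values for four items; 2.5 is taken as the neutral point.
toy_scale_values = [4.2, 1.3, 3.9, 0.8]
toy_key = sd_key(toy_scale_values, neutral_point=2.5)

# One hypothetical subject's True-False answers to the same four items.
toy_answers = [True, False, False, False]
toy_score = sd_score(toy_answers, toy_key)
```

Here the subject gives the socially desirable response to three of the four items, so the SD score is 3.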

The first SD scale I constructed (Edwards, 
1953a) consisted of 79 items from the MMPI. 
These 79 items were subsequently analyzed 
and reduced to the 39 items which best differ- 
entiated between a high and a low scoring 
group (Edwards, 1957b). There is evidence 
(Edwards, 1957b) to show that the results 
obtained with either the 79-item or the 39-item 
SD scale are comparable and the distinction 
between these two SD scales need not concern 
us. However, with but few exceptions, the 
research I shall be reporting upon was done 
with the 39-item SD scale. 

I have pointed out that all of the items in 
the SD scale are keyed for socially desirable 
responses. If we consider the question of what 
the SD scale is measuring, then, it seems to 
me, the simplest statement to be made is that 
it provides a measure of the tendency of sub- 
jects to give socially desirable responses in 
self-description under the standard instruc- 
tions ordinarily used with personality inven- 
tories. To determine the degree to which other 
personality scales might also be measuring 
this same tendency, I correlated scores on the 
79-item SD scale with a number of other
personality scales (Edwards, 1953a). In gen- 
eral, the correlations I obtained with other 
personality scales of the True-False type were 
of substantial magnitude. Furthermore, the
signs of the correlations were in the direction 
predicted by the social desirability hypothesis. 
For example, if a high score on a given scale 
indicated a socially undesirable trait, that is, 
if it was necessary to make a large number of 
socially undesirable responses to obtain a high 
score, then the correlation of this scale with the 
SD scale was negative. On the other hand, if a 
high score on a given scale indicated a socially 
desirable trait, that is, if it was necessary to 
make a large number of socially desirable 
responses to obtain a high score, then the cor- 
relation of this scale with the SD scale was 
positive. These results indicated to me that if 
the SD scale was measuring the tendency to 
give socially desirable responses, as I believed 
it was, then the scores on the other scales I 
had investigated were, to a remarkable degree, 
reflecting the same tendency. 

Correlations subsequently obtained by 
Fordyce (1956) between the 79-item SD scale 
and the clinical and validity scales of the 
MMPI and by Edwards, Heathers, and 
Fordyce (1960) between the 39-item SD 
scale and 11 “new” MMPI scales, described
by Hathaway and Briggs (1957), are also con- 
sistent with the social desirability hypothesis. 


ACQUIESCENCE HYPOTHESIS


Two articles by Cronbach (1946, 1950) on 
response sets have done much to arouse in- 
terest in the particular response set which 
Cronbach called, as had Lentz (1938) before 
him, acquiescence. Acquiescence, according to 
Cronbach, refers to the tendency of a subject 
to agree with or respond True to a test item 
when he is in doubt as to the appropriate or 
correct response. It has been suggested that 
items in personality inventories are of a kind 
likely to evoke doubts as to the appropriate 
response since subjects are required to inter- 
pret the content of the item along with the 
meaning of such terms as few, seldom, fre- 
quently, occasionally, sometimes, and often 
(Allport, 1937; Benton, 1935; Watson, 1959). 
If the interpretative ambiguity of a personality 
item does produce doubts as to the appropriate 
response, then the item is also likely to evoke 
acquiescent tendencies. Thus, if we have a 
personality scale in which a majority of the 
items are keyed False, and if these items also 
arouse doubts as to the appropriate response,
then it follows that a highly acquiescent sub- 
ject is apt to respond True to these keyed 
False items. The highly acquiescent subject, 
in other words, may be expected to obtain 
a lower score on the scale than he otherwise 
would if acquiescence were not involved. Thus, 
instead of measuring the trait we think the 
scale is measuring, the scores may be reflecting 
the differing acquiescent tendencies of the 
subjects. 
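
The argument above can be made concrete with a small deterministic sketch. It assumes a deliberately simple model that is not taken from the source: each item has an answer the subject would give on content alone, some items leave the subject in doubt, and an acquiescent subject answers True whenever in doubt. All names and values are hypothetical.

```python
def observed_responses(content_answers, in_doubt, acquiescent):
    """Responses actually given under the simple doubt model.

    An acquiescent subject answers True on every item that leaves him in
    doubt; otherwise the content-based answer is given unchanged.
    """
    return [True if (doubt and acquiescent) else answer
            for answer, doubt in zip(content_answers, in_doubt)]


def scale_score(responses, key):
    """Number of responses that match the keyed direction."""
    return sum(1 for answer, keyed in zip(responses, key) if answer == keyed)


# Hypothetical five-item scale with a majority of items keyed False.
key = [False, False, False, False, True]
content_answers = [False, False, False, True, True]
in_doubt = [True, True, False, False, False]

score_nonacquiescent = scale_score(
    observed_responses(content_answers, in_doubt, acquiescent=False), key)
score_acquiescent = scale_score(
    observed_responses(content_answers, in_doubt, acquiescent=True), key)
```

With identical content-based answers, the acquiescent subject scores 2 and the nonacquiescent subject scores 4, which is the lowering of scores that the acquiescence hypothesis predicts for a scale keyed mostly False.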


ACQUIESCENCE AND THE SD SCALE 


The argument for the acquiescence hypoth- 
esis, as I have presented it, can be applied 
to the 39-item SD scale in which 30 of the 
39 keyed socially desirable responses are False. 
Assuming the argument to be valid, we would 
expect the more acquiescent subjects to ob- 
tain low scores and the less acquiescent sub- 
jects to obtain high scores on the scale. It is a 
question of some importance, therefore, to 
ask whether a high score on the SD scale 
is primarily a measure of the tendency to give 
socially desirable responses or of the tendency 
to be nonacquiescent. I was quite aware of this 
possible double interpretation of scores on the 
SD scale, and my reasons for rejecting the 
acquiescence interpretation in favor of the 
social desirability interpretation are, I believe, 
clearly stated in my monograph. 

Let us look once again at Figure 1. Suppose 
we take only items from the socially desirable 
end of the continuum and key all of these 
items for the True response. Then a high 
score on this SD scale may be either a measure 
of a strong tendency to give socially desirable 
responses or of a strong tendency to be ac- 
quiescent or it may, of course, reflect both 
tendencies. If the original 39-item SD scale is 
primarily a measure of acquiescence and if this 
is also the case with the all-True SD scale, 
then, from the acquiescence point of view, 
scores on the two scales should correlate nega- 
tively, since the acquiescent subject will ob- 
tain a high score on the all-True SD scale and 
a low score on the 39-item SD scale in which, 
we recall, 30 of the 39 items are keyed False. 
On the other hand, if the original 39-item SD 
scale is primarily a measure of the tendency 
to give socially desirable responses and if this 
is also the case with the all-True SD scale, then 
the correlation between the two scales should 
be positive. Thus, I believe the obtained cor-
relation between these two SD scales would
provide a basis for choosing between the two
alternatives I have mentioned. In actual fact,
the correlation was positive (.70) and consist-
ent with the social desirability hypothesis, as
I reported in my monograph. In addition to
this correlation, I reported upon the correla-
tions of the all-True SD scale, and of another
SD scale in which the True-False keying was
balanced, with two MMPI scales: the D
or Depression scale in which a majority (67%)
of the items are keyed False and the Sc or
Schizophrenia scale in which a majority (78%)
of the items are keyed True. The correlations
obtained were consistent with the social
desirability hypothesis, whereas this was not
so in the case of the acquiescence hypothesis.
It was on the basis of these, and other, findings 
that I rejected the acquiescence interpretation 
of the SD scale in favor of the social desirability 
interpretation. 

At the time my monograph was in press, 
Fricke (1956) published an article in which he 
presented results which he interpreted as 
supporting the notion that the imbalance in 
the True-False keying of the various clinical 
and validity scales of the MMPI made them 
most susceptible to the acquiescence response 
set. In correspondence with me, since the pub-
lication of my monograph, Fricke has indi-
cated that he believes this is also the case
with the SD scale. Wiggins (1959) has also 
taken the position that the imbalance in the 
keying of the 39-item SD scale results in it 
being a measure of acquiescence. Thus, Wig- 
gins believes that the various correlations 
that have been reported between the SD scale 
and the MMPI scales should be appropriately 
considered as evidence of the influence of 
acquiescence in the MMPI rather than of 
social desirability. He cites, in support of this 
belief, a negative correlation he obtained 
between a “response bias” scale, developed 
by Fricke (1957), and the SD scale. Jackson 
and Messick (1958), in their review of response 
sets, also state that the correlations between 
the SD scale and the MMPI scales may reflect 
acquiescence, since the SD scale contains a
disproportionate number of items keyed False. 

I have already cited certain evidence to 
indicate why I do not agree with the acquies-
cence interpretation given the SD scale by
Fricke, Wiggins, Jackson, and Messick. Nor
do I share their apparent conviction that the
imbalance in the True-False keying of the
MMPI scales makes these scales susceptible
to acquiescent tendencies. And, as I shall
show later, the correlation between the SD
scale and Fricke’s (1957) response bias scale
is of the same sign and just about the same
magnitude that would be predicted by the
social desirability hypothesis. What is at
issue, then, is quite clear. It is this: Can the
correlations between the SD scale and the 
MMPI scales be more adequately accounted 
for in terms of the acquiescence hypothesis 
or the social desirability hypothesis? 


NEW EVIDENCE BEARING UPON THE TWO
HYPOTHESES


Acquiescence Hypothesis

Before presenting some new evidence bear-
ing upon these two alternative hypotheses
and so that there may be no misunderstand- 
ing, let me summarize briefly the argument 
for the acquiescence hypothesis as it has been 
stated by Fricke, Wiggins, Jackson, and 
Messick. If a majority of the items in an 
MMPI scale are keyed False, as is also the 
case with the SD scale, then the more acquies-
cent subjects should obtain low scores on both
scales, with the less acquiescent subjects ob- 
taining high scores on both scales, assuming, 
of course, that both scales are measuring 
acquiescence. Thus, the correlation between 
the SD scale and this MMPI scale, according 
to the acquiescence hypothesis, should be 
positive. On the other hand, if a majority of 
the items in an MMPI scale are keyed True, 
and if the acquiescence hypothesis is correct, 
then this scale and the SD scale should cor- 
relate negatively. 
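
The sign prediction summarized above can be written as a small rule. This is a sketch under stated assumptions: the function name is invented, the default of 30/39 is the proportion of False-keyed items in the 39-item SD scale given earlier, and returning "0" for an exactly balanced scale is an added convention that the argument itself does not spell out.

```python
def acquiescence_predicted_sign(prop_false_scale, prop_false_sd=30 / 39):
    """Sign of the correlation with the SD scale under the acquiescence hypothesis.

    Two scales keyed predominantly in the same True-False direction are
    predicted to correlate positively; two scales keyed predominantly in
    opposite directions are predicted to correlate negatively.
    """
    scale_direction = prop_false_scale - 0.5
    sd_direction = prop_false_sd - 0.5
    product = scale_direction * sd_direction
    if product > 0:
        return "+"
    if product < 0:
        return "-"
    return "0"  # assumed convention for an exactly balanced scale


# D scale: 67% of items keyed False. Sc scale: 78% of items keyed True.
sign_for_d = acquiescence_predicted_sign(0.67)
sign_for_sc = acquiescence_predicted_sign(1 - 0.78)
```

Under this rule the acquiescence hypothesis predicts a positive correlation of the SD scale with D and a negative one with Sc, purely from the direction of the keying.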

Let us accept, for the moment, the state- 
ment of Fricke, Wiggins, Jackson, and Messick 
that it is possible to evaluate the role of ac- 
quiescence in a scale by considering the per- 
centage of items keyed True (or False) and 
using this percentage as an index of the ac- 
quiescence set. I have obtained this index 
for each of the 43 MMPI scales,² that is, for
each scale I obtained the proportion of items
keyed False. Figure 2 shows the SD correla-
tions³ plotted against the proportion of keyed
False items in each of the 43 MMPI scales.
It is clear that there is some tendency for the
SD correlations with the 43 MMPI scales to
be related to the imbalance in the True-False
keying of the scales. The product-moment
correlation between the two variables is .55.
Thus, approximately 30% of the variation in
the SD correlations can be accounted for by the
imbalance in the True-False keying.

² The abbreviated code names for the 43 MMPI
scales, as given by Dahlstrom and Welsh (1960), are
as follows: L, F, K, Hs, D, D-S, D-O, Hy, Hy-S, Hy-O,
Pd, Pd-S, Pd-O, Mf-m, Pa, Pa-S, Pa-O, Pt, Sc, Ma,
Ma-S, Ma-O, Si, Dy, St, At, Ho, Pv, Do, Re, A, R,
{d, Dn, Es, Eo, No, Nu, Pn, Cn, Ca, Ne, and B.

Fig. 2. Correlations of 43 MMPI scales with the
SD scale as a function of the proportion of keyed False
items in the MMPI scales.


Social Desirability Hypothesis 


Let us now see how well the social desira- 
bility hypothesis accounts for the same SD 
correlations. Knowing the distribution of social
desirability scale values of the items in a per- 
sonality test it is possible to classify the items 
in the scale in terms of whether the keyed re- 
sponse is a socially desirable or a socially unde- 
sirable response. According to the social desira- 
bility hypothesis, if the scale contains a large 
proportion of keyed socially desirable re- 
sponses, then the correlation between this scale 
and the SD scale should be positive. On the 
other hand, if the scale contains a large propor-
tion of keyed socially undesirable responses,
then the correlation with the SD scale should
be negative. If the keying of the items is rela-
tively balanced for socially desirable and so-
cially undesirable responses, then the correla-
tion with the SD scale should be low. Social
desirability scale values for the MMPI items
have been obtained by Heineman (1953) and I
have used these scale values to classify the
items in each of the 43 MMPI scales in terms
of whether or not the keyed response is a
socially desirable or a socially undesirable one.

³ All of the correlations are with the 39-item SD
scale. The correlations between the SD scale and the
subtle and obvious scales of the MMPI are from an
unpublished study by Fordyce and Rozynko reported
by Edwards (1957b). The SD correlation with the B
scale is reported by Wiggins (1959). The remaining
correlations are based upon a sample originally tested
by Merrill and Heathers (1956).

Fig. 3. Correlations of 43 MMPI scales with the SD
scale as a function of the proportion of keyed socially
desirable responses in the MMPI scales.

I should point out that whether or not the 
keyed response is a socially desirable response 
is not necessarily a fixed characteristic of an 
MMPI item for all scales. For example, a 
given item may appear in more than one 
MMPI scale. In some scales the True response 
may be keyed and in other scales the False 
response may be keyed. Thus, in some scales 
a given item may be keyed for a socially de- 
sirable response and in other scales keyed 
for a socially undesirable response. This, of 
course, was also the case when we considered
the imbalance in the True-False keying of the 
MMPI items. 
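
The classification described in the last two paragraphs can be sketched as follows. The item identifiers, the two toy scales, and the function names are hypothetical; the only rule taken from the text is that a keyed response is socially desirable when it is True for a desirable item or False for an undesirable item.

```python
def keyed_response_is_desirable(item_is_desirable, keyed_true):
    """True response to a desirable item, or False response to an undesirable one."""
    return item_is_desirable == keyed_true


def proportion_keyed_desirable(scale_key, item_desirability):
    """Proportion of a scale's keyed responses that are socially desirable.

    `scale_key` maps item id -> keyed response (True or False).
    `item_desirability` maps item id -> True if the item's scale value
    is on the socially desirable side, False otherwise.
    """
    flags = [keyed_response_is_desirable(item_desirability[item], keyed)
             for item, keyed in scale_key.items()]
    return sum(flags) / len(flags)


def proportion_keyed_false(scale_key):
    """Proportion of a scale's items that are keyed False."""
    return sum(1 for keyed in scale_key.values() if not keyed) / len(scale_key)


# Hypothetical items: "a" is socially desirable, "b" and "c" are not.
item_desirability = {"a": True, "b": False, "c": False}

# Item "a" appears in both toy scales but is keyed in opposite directions.
scale_x = {"a": True, "b": False}
scale_y = {"a": False, "c": True}

x_desirable = proportion_keyed_desirable(scale_x, item_desirability)
y_desirable = proportion_keyed_desirable(scale_y, item_desirability)
```

The same item contributes a socially desirable keyed response in one scale and a socially undesirable one in the other, so both proportions have to be computed scale by scale rather than item by item.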

Figure 3 shows the SD correlations plotted 
against the proportion of keyed socially de-
sirable responses in the 43 MMPI scales.⁴
There can be no doubt that we have a much
closer relationship in this instance than when
we plotted these same correlations against
the proportion of keyed False items in the
scales. The product-moment correlation be-
tween the two variables in Figure 3 is .92
and we can account for approximately 85%
of the variance in the SD correlations with the
43 MMPI scales in terms of the proportion
of keyed socially desirable responses in the
scales.

⁴ Fricke’s response bias scale is located in Figure 3
by the point with coordinates .40 and -.59 and is in
accord with the trend for the other points shown in
the figure.


Partial and Multiple Correlations


We recall that, by considering the propor- 
tion of keyed False items in the MMPI scales, 
we could account for only 30% of the variance 
in the SD correlations. Even this figure of 
30% is spuriously high, as we shall now see. 
Consider the relationship shown in Figure 4. 
The X or horizontal axis gives the proportion 
of keyed False items in each MMPI scale and 
the Y or vertical axis gives the proportion of 
keyed socially desirable responses for the 
scales. It is evident that there is some rela- 
tionship between the imbalance in the True- 
False keying for a scale and the imbalance 
in the social desirability keying for the same 
scale. In other words, those scales which tend 
to have a large proportion of keyed False 
items also tend to have a large proportion of 
keyed socially desirable responses. The prod- 
uct-moment correlation between the two sets 
of proportions shown in Figure 4 is .42. 

Fig. 4. Relationship between the proportion of
items keyed for socially desirable responses in 43
MMPI scales and the proportion of items keyed for
False responses in the scales.

It is of interest, therefore, to consider what
happens to the correlation of .55 between the
proportion of keyed False responses in a scale
and the correlations with the SD scale, when
we partial out the proportion of keyed socially 
desirable responses. As shown in Table 1, the 
partial correlation is .46. So, you see, if we 
hold constant or partial out the proportion of 
keyed socially desirable responses, the correla- 
tion between the proportion of keyed False 
responses and the SD correlations drops from 
.55 to .46. On the other hand, if we obtain the
partial correlation between the proportion of 
keyed socially desirable responses and the SD 
correlations, holding the proportion of keyed 
False responses constant, we find that this 
partial correlation is .91 and this value does 
not differ greatly from the original zero-order 
correlation of .92. It, thus, appears that the 
variation in the correlations of the MMPI 
scales with the SD scale can be more ade- 
quately accounted for by the fact that the 
MMPI scales vary in the degree to which 
they are keyed for socially desirable responses 
than by the fact that these same scales also 
vary in the degree to which they are keyed 
for False responses. 
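
The figures in this passage can be checked from the three zero-order correlations alone. The sketch below uses the standard first-order partial correlation formula; the text does not say how the partials were computed, so the formula is an assumption, although it does reproduce the reported values. The constant and function names are invented.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation between x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Zero-order correlations reported in the text.
R_FALSE_WITH_SD_CORR = 0.55       # proportion keyed False vs. SD correlations
R_DESIRABLE_WITH_SD_CORR = 0.92   # proportion keyed desirable vs. SD correlations
R_FALSE_WITH_DESIRABLE = 0.42     # the two proportions with each other

# Share of variance in the SD correlations accounted for by each predictor.
variance_from_false = R_FALSE_WITH_SD_CORR ** 2          # about .30
variance_from_desirable = R_DESIRABLE_WITH_SD_CORR ** 2  # about .85

# Proportion keyed False, holding the proportion keyed desirable constant.
partial_false = partial_r(R_FALSE_WITH_SD_CORR,
                          R_FALSE_WITH_DESIRABLE,
                          R_DESIRABLE_WITH_SD_CORR)

# Proportion keyed desirable, holding the proportion keyed False constant.
partial_desirable = partial_r(R_DESIRABLE_WITH_SD_CORR,
                              R_FALSE_WITH_DESIRABLE,
                              R_FALSE_WITH_SD_CORR)
```

Rounded to two places, `partial_false` is .46 and `partial_desirable` is .91, in agreement with Table 1.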

We can examine the same point in a slightly 
different way by looking at the multiple cor- 
relation between the SD correlations and the 
proportion of keyed socially desirable responses 
and the proportion of keyed False responses. 
The zero-order correlation between the SD 
correlations and the proportion of keyed 
socially desirable responses in the MMPI 
scales is .92. The gain in efficiency by taking 
into account also the proportion of keyed 
False responses in the MMPI scales is very 
slight. The multiple correlation, for example, 
is only .94. Thus, the proportion of keyed 
False responses accounts for very little of the 
remaining variation in the SD correlations 
after we have considered the proportion of 
keyed socially desirable responses in the scales. 
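
The multiple correlation can likewise be recovered from the three zero-order correlations. The sketch uses the standard two-predictor formula for the squared multiple correlation; as before, the formula is an assumption about the computation, and the names are invented.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of y with two predictors.

    r_y1 and r_y2 are the correlations of y with each predictor, and
    r_12 is the correlation between the two predictors.
    """
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


# y = SD correlations; predictor 1 = proportion keyed False;
# predictor 2 = proportion keyed socially desirable.
R = multiple_r(0.55, 0.92, 0.42)

# Additional variance accounted for beyond the desirable keying alone.
gain_in_variance = R ** 2 - 0.92 ** 2
```

`R` rounds to .94, and the added variance is only about three percentage points, which is the slight gain described above.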

On the basis of the evidence presented, I 
think it is possible to conclude that the im- 
balance in the True-False keying of the items 
in the MMPI scales is of relatively little im- 
portance compared with the imbalance in 
the social desirability keying in accounting 
for the correlations of the MMPI scales with 
the SD scale. 


TABLE 1

CORRELATION BETWEEN SD SCALE CORRELATIONS OF
MMPI SCALES AND ITEM PROPERTIES OF
THE MMPI SCALES

                 Proportion     Proportion
Correlation      keyed False    keyed SD

Zero-order           .55            .92
Partial              .46            .91
Multiple                    .94


CONCLUDING REMARKS


In the way in which the acquiescence hy-
pothesis has been applied to personality scales,
little attention has been given to the problem 
of what kinds of items are likely to evoke 
acquiescent tendencies. For example, in a 
recent paper, Couch and Keniston (1960) 
define acquiescence as the tendency to give 
Agree (or True) responses to items regardless 
of the item content. We have seen, however, 
that the probability of a True response to a 
personality item is an increasing linear func- 
tion of the social desirability scale value of the 
item. If acquiescence exists as a personality 
trait, in the sense defined by Couch and Kenis- 
ton, then the probability of a True response for 
an acquiescent subject must be greater than 
the corresponding probability for a nonacquies- 
cent subject for all items, regardless of the 
social desirability scale values of the items. In 
other words, we would have two regression 
lines for probability of item endorsement, one 
for acquiescent subjects and one for nonacqui- 
escent subjects, with that for acquiescent sub- 
jects having the greater Y intercept. This is an 
interesting possibility but one for which as 
yet there is no supporting evidence. 
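
The two-regression-line picture can be illustrated with a toy linear model. Every number below is invented, and since no evidence for such a pattern is reported, the sketch shows only what the content-independent view of acquiescence would imply: the same slope for both groups, with a higher intercept for acquiescent subjects.

```python
def endorsement_probability(scale_value, intercept, slope):
    """Linear probability of a True response, clipped to the range 0 to 1."""
    return min(1.0, max(0.0, intercept + slope * scale_value))


# Hypothetical parameters: common slope, different Y intercepts.
SLOPE = 0.2
NONACQUIESCENT_INTERCEPT = 0.1
ACQUIESCENT_INTERCEPT = 0.2

# Invented social desirability scale values, undesirable to desirable.
scale_values = [0, 1, 2, 3]

nonacquiescent = [endorsement_probability(v, NONACQUIESCENT_INTERCEPT, SLOPE)
                  for v in scale_values]
acquiescent = [endorsement_probability(v, ACQUIESCENT_INTERCEPT, SLOPE)
               for v in scale_values]
```

At every scale value the acquiescent line lies above the nonacquiescent line, which is what "regardless of the social desirability scale values of the items" requires.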

Rather than regarding acquiescence as 
independent of item content, I would agree 
with Cronbach that response sets operate 
only when items are difficult or ambiguous. 
Response sets, as he points out, do not operate 
apart from the items which evoke them. It is 
only when an item is in some way unclear so 
that the subject is in doubt as to the appropri- 
ate or correct response that response sets be- 
come of importance. 

In an achievement test, for example, a 
difficult item is, by definition, one to which 
only a few subjects know the correct response. 
Thus, a difficult item can be said to evoke 
doubts as to the correct response among more 
subjects than will an easy item. Difficult 
items, in other words, provide greater oppor- 
tunities for acquiescent responses to occur 
than do easy items. If a test consists of a 
series of very easy items, then we may expect
few of the total responses to reflect acquiescent 
tendencies. As the difficulty level of the items 
increases, then we may expect an increasing 
number of responses to reflect acquiescent 
tendencies. 

Suppose, for example, we divide the items 
in an achievement test into those that are 
difficult and those that are easy. Then scores 
based only upon the easy items are apt to be 
relatively free from acquiescent tendencies, 
simply because easy items provide relatively 
few opportunities for acquiescent responses 
to occur. On the other hand, scores based only 
upon the difficult items may be very much 
influenced by acquiescent tendencies because 
these items provide many opportunities for 
acquiescent responses to occur. 

I believe the situation with respect to per- 
sonality tests is somewhat similar. Scores on 
personality tests will be influenced by acquies- 
cent tendencies only to the degree to which the 
test contains items which are difficult or am- 
biguous and, thus, provide opportunities for 
acquiescent responses to occur. But we must 
consider not only the interaction between 
the items and acquiescent tendencies but also 
the interaction between the items and social 
desirability tendencies. There is evidence to 
indicate that the tendency to give socially 
desirable responses is so powerful, so dominant, 
that if it is evoked by an item, then acquies- 
cent tendencies are of little importance. The 
kinds of items which evoke the social desira- 
bility set are those with socially desirable or 
socially undesirable scale values. I would 
assume, therefore, that if a personality test 
contains mainly items with socially desirable 
or socially undesirable scale values, scores 
on the test will be little influenced by acquies- 
cent tendencies. With neutral items the socially 
desirable response is not obvious and the 
social desirability set cannot operate in the 
same way in which it does with nonneutral 
items. I would assume, therefore, that re- 
sponses to neutral items may be much more 
influenced by acquiescent tendencies than 
responses to nonneutral items. 

If I am correct in this analysis, then it is the 
neutral items in a personality test which will 
provide opportunities for acquiescent tend- 
encies to influence scores on the test. If the 
neutral items are consistently keyed for True 
responses, then we may expect the more
acquiescent subjects to obtain higher scores
on the test than nonacquiescent subjects.
If the neutral items are balanced in their 
True-False keying, then we may expect 
acquiescent tendencies to have little influence 
on the total test scores. 

If most of the items in a test have socially 
desirable or socially undesirable scale values 
and only a few items have neutral social desira- 
bility scale values, then acquiescent tendencies 
should have relatively little influence upon the 
total test scores. An examination of many per- 
sonality scales leads me to suggest that this 
is most often the case. Very few of these scales 
contain many neutral items in relation to the 
total number of items in the scale. Under this 
condition, we may expect scores on a scale 
to be relatively little influenced by acquiescent 
tendencies, regardless of whether or not there 
is an imbalance in the True-False keying of 
the items. The SD scale, in which all of the 
items have socially desirable or socially un- 
desirable scale values is, I believe, a case in 
point. 

If most of the items in a personality scale 
have nonneutral social desirability scale values, 
then the nature of the social desirability keying 
of the items is of importance not only for an 
evaluation of the social desirability set but also 
for an evaluation of the acquiescence set. Ac-
quiescence, for example, has sometimes been in-
vestigated by obtaining the correlation be-
tween the separate scores on the keyed True
and the keyed False items in a scale. A nega- 
tive or low correlation between the scores 
based upon the keyed True items and the 
keyed False items has been interpreted as 
evidence of the influence of acquiescence. I 
wish to emphasize, however, that this is not 
necessarily the case. For example, the social 
desirability hypothesis would predict a nega- 
tive correlation between the part scores if the 
True items are consistently keyed for socially 
desirable responses and the False items for 
socially undesirable responses, or vice versa; 
a low correlation if the True items are consist- 
ently keyed for socially desirable or socially 
undesirable responses and if there is a balance 
in the social desirability keying of the False 
items, or vice versa; and a positive correla- 
tion between the part scores if the items in 
both sets are consistently keyed for either 
socially desirable or socially undesirable
responses. Thus, what may on the surface
appear to be the influence of acquiescence
may, upon closer examination, be the result
of social desirability.
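
The three predictions of the social desirability hypothesis can be checked in a small simulation. The Python sketch below is an illustration only and rests on strong assumptions that are not in the original address: each simulated subject has a single tendency `s` to give the socially desirable response, item content and acquiescence play no role, and the numbers of items and subjects are arbitrary.

```python
import random


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def part_score_correlation(true_keys, false_keys, n_subjects=500, seed=0):
    """Correlate part scores on the keyed-True and keyed-False items.

    Each key list states, item by item, whether the keyed response is the
    socially desirable one (True) or the socially undesirable one (False).
    """
    rng = random.Random(seed)
    true_scores, false_scores = [], []
    for _ in range(n_subjects):
        s = rng.random()  # hypothetical probability of a desirable response

        def score(keys):
            # A keyed response is given with probability s if it is the
            # desirable response and 1 - s if it is the undesirable one.
            return sum(rng.random() < (s if desirable else 1.0 - s) for desirable in keys)

        true_scores.append(score(true_keys))
        false_scores.append(score(false_keys))
    return pearson(true_scores, false_scores)


# True items keyed desirable, False items keyed undesirable.
r_opposite = part_score_correlation([True] * 16, [False] * 16)
# True items keyed desirable, False items balanced in desirability keying.
r_balanced = part_score_correlation([True] * 16, [True, False] * 8)
# Both sets keyed consistently for desirable responses.
r_same = part_score_correlation([True] * 16, [True] * 16)

print(round(r_opposite, 2), round(r_balanced, 2), round(r_same, 2))
```

Under these assumptions the sign of the part-score correlation is fixed by the desirability keying alone, with no acquiescence in the model at all.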

Finally, I should like to say that for the 
reasons I have given in my address, I regard 
the attempt to evaluate acquiescent tendencies 
in terms of an imbalance in the True-False 
keying of all of the items in a test as completely 
inadequate. To determine the imbalance in 
the True-False keying is a very easy approach 
to the problem, but one which overlooks the 
complexity of the problem. 

THE MOTHER-CHILD RELATIONSHIP IN BRONCHIAL ASTHMA¹


MARVIN MARGOLIS²


Michigan State University


SINCE the publication of French and
Alexander’s (1941) now classic monograph,
“Psychogenic Factors in Bronchial Asthma,”
the mother-child relationship
has been increasingly implicated as an etio- 
logical factor in the genesis of childhood 
asthma. French and Alexander conceived ex- 
cessive dependence on and longing for the 
mother as being at the very core of the psycho- 
dynamics of asthma. This fear of losing ma- 
ternal love may arise from the threat of an 
actual separation or from the patient’s anxiety 
that the exposure of his aggressive and sexual 
fantasies would evoke the mother’s rejection. 
French and Alexander reported that most of 
their cases were characterized by marked and 
early maternal rejection. 

Numerous clinical reports have since con- 
curred with this formulation (Coolidge, 1956; 
Dunbar, 1938; Gerard, 1946; Jessner, Lamont, 
Long, Rollins, Whipple, & Prentice, 1955; 
Monsour, 1960; Sperling, 1949), suggesting 
that the mothers of asthmatics often carry 
into their relationship with their children their 
own unresolved childhood conflicts. Sperling 
(1949) noted that these mothers have an un- 
conscious need to keep their asthmatic off- 
spring in a helpless and dependent state to a 
degree found only among mothers of psychotic 
children. Monsour (1960) was so impressed 
with the extent of the psychological conflicts 
of these mothers that he stated that “treatment 
of the mother seems inescapable in the cases of 
young asthmatic children.” 

There have been few controlled, objective,
experimental studies of the mothers of asthmatic
children. Miller and Baruch (1951, 1957)
reported studies of 201 allergic children (including
asthmatics) and their parents: 97% of
the mothers were described as having “verbally
expressed their rejecting attitude toward these
children”; only 37% of control mothers
(mothers of “problem” children) expressed
similar feelings of rejection. While these differences
were statistically significant, they were
apparently based on subjective evaluations by
Miller and Baruch themselves. More standardized,
objective measuring procedures would
seem more desirable. Another objection to this
type of research lies in the lack of a chronically
ill but nonpsychosomatic control group. Without
such controls, the alleged rejecting attitudes
of the mothers of allergic children may be attributed
to their reactions to caring for a chronically ill
child, often an oppressive burden, both economically
and psychologically. Demanding
and often irritable, such children require a
mother to make many personal sacrifices to
care for them properly. Many mothers feel
personally responsible for their child’s illness,
and this added strain often aggravates otherwise
quiescent conflicts, introducing a “reactive
factor.”

1 This article is based upon a dissertation submitted
in candidacy for the degree of Doctor of Philosophy at
Michigan State University, August 1959. The writer is
indebted to the members of his dissertation committee,
Albert I. Rabin (chairman), Charles Hanley, and
Donald M. Johnson, for their invaluable advice and
encouragement.

2 Now at Alcoholism Treatment Center, Highland
Park Health Department, Highland Park, Michigan.

Cutter (1955) reported studying 33 mothers 
of asthmatic children by means of a question- 
naire, measuring parent-child interaction on 
“warmth,” “freedom,” and “control.” Sta- 
tistically significant differences were not found 
between experimental and control groups. 
Fitzelle (1959) studied a group of 100 mothers 
of asthmatic children, using the MMPI and 
the USC Parent Attitude Inventory. Again, no 
significant differences were found. These two 
studies, then, raise doubt about the validity of 
clinical claims regarding the pathogenicity of 
maternal attitudes in the genesis of bronchial 
asthma.

Some clinicians, however, may question 
whether propositions derived from psycho- 
analytic observations, involving intensity and 
character of psychological conflicts, can be 
adequately tested by instruments basically de- 
signed to measure attitudes regarding child 
rearing behavior (e.g., the USC Parent Atti- 
tude Inventory). Clinical reports have sug- 
gested that some mothers of asthmatic children 
overindulge their children, whereas others are
openly hostile and rejecting. Both groups, 
however, may share the same core conflicts, 
centered on unresolved dependency relation- 
ships with their own mothers. Their observable 
behavior and their attitudes regarding child 
rearing could, therefore, be quite different and 
have the effect of cancelling the extremes. As 
a result, experimental and control groups may
show similar scores on attitude scales. It is also 
apparent that the intensity of psychosexual 
conflicts cannot be easily inferred from MMPI 
indices, which are not directly relevant to the 
testing of hypotheses by psychoanalytically 
oriented clinicians who have studied asthma. 

This study has attempted to answer some of 
these objections. A projective test based upon 
psychoanalytic constructs (Blacky Pictures 
Test) was included in the test battery. All tests 
employed could be objectively scored. A 
chronically ill but nonpsychosomatic control 
group was provided to aid in evaluating the 
reactive factor. Such a design may more 
adequately test the notion that the mother- 
child relationship plays a primary, pathogenic 
role in the etiology of bronchial asthma. It was 
hypothesized that mothers of asthmatic 
children would appear more emotionally disturbed on
the tests selected than mothers of comparable 
control subjects. 


METHOD 
Sample 


The sample for this study was drawn from the
mothers of patients treated at Children’s Hospital in
Detroit, Michigan.³ The experimental group consisted
of mothers of asthmatic children in the Allergy Clinic
(A mothers). A group of mothers of chronically ill
children was selected from the cases in the Rheumatic
Heart Clinic to serve as a chronically ill, nonpsychosomatic
control group (RH mothers). No women were
selected for this control group if any of their children
had a severe allergic disease. Because rheumatic fever
is a disease that is usually equal to or greater in severity
than asthma (Friedberg, 1956, p. 858), the RH group
represented very conservative, chronically ill controls.

A second control group was selected from the
mothers of relatively healthy children. Nineteen
mothers were chosen from the Surgery Clinic. Their
children were in the hospital for such routine operations
as appendectomies. Six mothers were selected from the
Outpatient Department (OPD), where the patients
were being treated for minor cuts and burns. Mothers
were tested several days after their children’s operation
or clinic treatment so that any situational anxiety
attendant upon the hospital procedures might be
dissipated. Only those mothers were included who had
no allergic or chronically ill children. This group is
referred to as the surgery-OPD (SOPD) mothers or
healthy controls.

3 Appreciation is expressed to the following members
of the staff of Children’s Hospital for their assistance
in obtaining subjects for this study: Joseph Fischhoff
(consulting psychiatrist); Manes S. Hecht (Director of
Rheumatic Heart Clinic); Samuel J. Levin (Director
of Allergy Clinic); Charles N. Weller (Director of
Surgery Clinic); and Paul V. Woolley, Jr. (Pediatrician-in-chief).

Table 1 indicates that the three groups were drawn 
approximately from the same socioeconomic, religious, 
and racial groups. In general, these mothers were 
lower middle to upper lower in status. The three groups 
were matched with regard to race because of Negro- 
white differences in the role of the mother in family life 
(Frazier, 1951). That this was a necessary precaution 
will be seen in a subsequent section of this paper.⁴


Procedure 


Mothers were asked to participate in the study
either by their physician or by the investigator. All mothers
who appeared at their respective clinics during a 
period of several consecutive weeks were asked. In 
almost every case they agreed to participate. They 
were told by the investigator that this study of the 
relationship of maternal attitudes to illness was being 
conducted in many clinics throughout the hospital. 
They were tested at the hospital in groups of three to 
five mothers. Five to six mothers in each group, who 
volunteered to participate but subsequently were 
unable to come to the hospital for testing, were tested 
at home. Therefore, neither refusal to participate nor 
inability to complete the testing procedure constituted 
a serious source of bias in this study. 


Test Battery 


The Blacky Pictures Test was selected as the major test
in the test battery because it is specifically based upon
psychoanalytic constructs. It can also be group 
administered and objectively scored. Each of the 
Blacky cartoons is concerned with either a particular 
stage of psychosexual development or type of object 
relationship and can be scored for intensity and char- 
acter of conflict in that particular area. The subject’s 
total performance on each card is summed up in the 
Overall Dimensional Score. A “fairly strong” or “very
strong” Overall Dimensional Score reflects conflict. A
“weak” score indicates a relative lack of conflict in
that area. The Revised Scoring System for Research
Use (Female form) by Blum (1951) was used.⁵ Code
numbers were assigned at random to all protocols so
that scoring was done without knowledge of whether
a particular test belonged to an experimental or control
subject.

4 The background data obtained from the mothers
in regard to ordinal position of their patient-child are
also worthy of attention. Many have speculated about
a possible link between ordinal position and asthma;
they have observed that asthmatics are often oldest
children. Jessner, Lamont, Long, Rollins, Whipple, and
Prentice (1955) found that 19 out of 28 asthmatics in
their sample were oldest or only children. This was not
the case in this study, for only 11 out of 25 cases were
oldest or only children; the differences between experimental
and control groups in this regard were not
statistically significant.

TABLE 1
CHARACTERISTICS OF EXPERIMENTAL AND CONTROL SUBJECTS

                     Race            Religion            Education   Income           Marital status
Group              Negro  White   Protestant  Catholic   (years)     (total family)   Living with husband   Separated
Asthma mothers      15     10        17          8         11.7        $3,921.28             19                  6
RH mothers          15     10        19          6         11.0        $4,063.00             22                  3
SOPD mothers        15     10        24          1         11.5        $4,134.60             22                  3

The second test in the battery was the Parental 
Attitude Research Instrument (PARI), devised by 
Schaefer and Bell (1955). Form IV of the PARI was 
used, consisting of 23 five-item scales measuring 
attitudes toward child rearing and family life. This 
instrument was selected because it would provide data 
complementary to the Blacky Pictures in its assessment 
of relatively more conscious, attitudinal material, and 
also because it is comparable to the other attitude 
measuring devices used in earlier studies. 

PARI items are measured on a four-point scale 
ranging from strongly agree to strongly disagree. The 
majority of the scales are so constructed that less 
desirable child rearing attitudes are associated with 
agreement. This type of scale construction has been 
criticized because scores can be distorted on the basis of 
acquiescence as a response bias. Therefore, an 
Acquiescence scale was constructed to control for this 
factor. This 30-item scale was based on 15 items drawn 
from the F Scale and 15 reversed content F Scale items, 
prepared originally by Jackson and Messick (1957). The 
acquiescence items were interspersed among the 115 
items of the PARI. 
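
The logic of such a control scale can be sketched briefly. The Python fragment below is an illustration only: the report does not give the scoring rule, so the simple count of agreements and the labels of the two agree options are assumptions made for this sketch.

```python
# Assumed labels for the two "agree" options of the four-point scale.
AGREE = {"strongly agree", "mildly agree"}


def acquiescence_score(responses):
    """Count agreements across all 30 items, ignoring item content."""
    if len(responses) != 30:
        raise ValueError("expected 30 responses (15 original and 15 reversed items)")
    return sum(1 for r in responses if r in AGREE)


# A respondent who answers on content agrees with the 15 original items and
# disagrees with their 15 reversals; an indiscriminate yea-sayer agrees with all 30.
consistent = ["strongly agree"] * 15 + ["strongly disagree"] * 15
yea_sayer = ["mildly agree"] * 30

print(acquiescence_score(consistent), acquiescence_score(yea_sayer))
```

Because each original item is paired with a reversal, agreement beyond half of the items cannot be explained by content and can be read as response bias.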


The following two hypotheses were tested 
in this study: 

1. More mothers of asthmatic children are 
characterized by intense psychosexual conflicts 
than are the mothers of controls. This is mani- 
fested by “stronger” Overall Dimensional 
Scores on the Blacky Pictures Test. 

2. The mothers of asthmatics exhibit more 
psychologically damaging attitudes regarding 
child rearing and family life than the mothers 
of control subjects. This is manifested by a 
higher score on the pathogenic scales of the 
PARI. 


5 The writer is grateful to Gerald S. Blum for his
supervision in scoring the Blacky protocols.


RESULTS 


Table 2 consists of 26 comparisons made 
between the experimental and control mothers 
on their Overall Dimensional Scores on the 
Blacky Pictures Test. The chi square technique 
was used to analyze the data. The Yates cor- 
rection was utilized whenever the expected 
frequencies were below 10. Three of the tests 
were statistically significant; in each case, the 
differences were in the predicted direction. On
all three of these tests, the asthma mothers had higher
Overall Dimensional Scores, the indices of greater
psychosexual conflict, than the control mothers.
Three significant tests out of 26 are slightly 
more than would be expected by chance alone. 
Therefore, the first hypothesis is essentially 
supported by these data. 
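
The comparison with chance can be made concrete with a short calculation. The Python sketch below is not part of the original analysis; it assumes that the 26 tests are independent and that each has a .05 chance of reaching significance under the null hypothesis. Neither assumption holds exactly here, since the two sets of comparisons share the same asthma group.

```python
from math import comb


def prob_at_least(k: int, n: int, alpha: float) -> float:
    """Binomial probability of k or more significant results in n independent tests."""
    return sum(comb(n, i) * alpha**i * (1.0 - alpha) ** (n - i) for i in range(k, n + 1))


n_tests, alpha = 26, 0.05
expected_by_chance = n_tests * alpha              # about 1.3 significant tests
p_three_or_more = prob_at_least(3, n_tests, alpha)

print(expected_by_chance, round(p_three_or_more, 3))
```

Under these simplifying assumptions, three or more significant results out of 26 would arise by chance roughly 14 times in 100.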

An analysis of the content of the statistically 
significant tests afforded hints as to the nature 
of the possibly major conflict areas of the 
asthma mothers. They were most disturbed in 
their Oedipal relationships. On Card IV (Oedi- 
pal Intensity) they had higher scores than the 
mothers of both the chronically ill and healthy 
controls. Moreover, the differences on Card IV 
between the A and SOPD mothers were greater 
than the differences between the A and RH 
mothers, as expected from considerations of 
the reactive role of physical illness in children 
in generating maternal conflicts. The A mothers 
also had significantly higher scores than the 
SOPD mothers on Card I (Oral Eroticism). 
The RH mothers had higher scores than the 
SOPD but lower scores than the A mothers on 
this dimension. In comparing the RH and 
SOPD mothers with each other, one statisti- 
cally significant test was found on Card VIII 
(Sibling Rivalry). The RH mothers had the 
higher Overall Dimensional Scores, suggesting 
greater psychopathology. 
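
The chi square procedure described above can be reproduced from the frequencies in Table 2. The Python sketch below is an illustration and not the author's own computation; it applies the usual 2 x 2 formula, with the Yates correction whenever the smallest expected frequency falls below 10, to the Card IV counts of strong and weak scores (A: 22 and 3; RH: 16 and 9; SOPD: 14 and 11). The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates_below=10.0):
    """Chi square for the 2 x 2 table [[a, b], [c, d]].

    Returns the statistic and a flag showing whether the Yates correction
    was applied (smallest expected frequency below `yates_below`).
    """
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    smallest_expected = min(r1, r2) * min(c1, c2) / n
    diff = abs(a * d - b * c)
    use_yates = smallest_expected < yates_below
    if use_yates:
        diff = max(diff - n / 2.0, 0.0)
    return n * diff**2 / (r1 * r2 * c1 * c2), use_yates


# Card IV (Oedipal Intensity): strong vs. weak Overall Dimensional Scores.
chi_a_sopd, yates_a_sopd = chi_square_2x2(22, 3, 14, 11)
chi_a_rh, yates_a_rh = chi_square_2x2(22, 3, 16, 9)

print(round(chi_a_sopd, 2), yates_a_sopd)
print(round(chi_a_rh, 2), yates_a_rh)
```

For the comparison of A and SOPD mothers this yields 4.86, the value reported in Table 2; for A and RH mothers it yields 2.74.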

A separate analysis of the Overall Dimensional
Scores along racial lines was made.
The same trends continued: A mothers, in
both the white and Negro groups, demonstrated
greater conflict on both Cards I and IV.

TABLE 2
OVERALL DIMENSIONAL SCORES FOR TOTAL SAMPLE
(N = 75)

                                  A mothers    RH mothers   SOPD mothers   χ² between    χ² between
Blacky dimension                  +ᵃ    0ᵇ      +     0       +     0       A and RH      A and SOPD
                                                                            mothers       mothers
I. Oral Eroticism                 14    11     12    13       6    19        .32          [illegible]
II. Oral Sadism                    6    19      9    16      10    15        .38           .82
III. Anal Sadism (Exp.)           10    15      8    17       9    16        .10          0.00
III. Anal Sadism (Ret.)           13    12     13    12      13    12       0.00          0.00
IV. Oedipal Intensity             22     3     16     9      14    11       2.74*         4.86**
V. Masturbation Guilt              9    16     12    13      10    15        .74          0.00
VI. Penis Envy                     0    25      2    23       3    22        .52          1.42
VII. Identification Process       19     6     16     9      16     9        .38           .38
VIII. Sibling Rivalry             14    11     18     7      10    15       [illegible]   1.28
IX. Guilt Feelings                 8    17     13    12      11    14       2.06           .34
X. Ego Ideal                      10    15     10    15       8    17       0.00          0.10
XI. Narcissistic Love Object      18     7     23     2      19     6       2.18          0.00
XI. Anaclitic Love Object         17     8     12    13      14    11       2.06          0.36

Note. All χ² values represent one-tailed tests.
a + represents “very strong” and “fairly strong” Overall Dimensional Scores.
b 0 represents “weak” Overall Dimensional Scores.
* p = .05-.025
** p = .025-.01

Additional perspective can be obtained by a 
further, more qualitative analysis of some of 
the components of the Blacky data that are 
summed up by the Overall Dimensional Score.⁶
The “strong” spontaneous stories of the A
mothers to Card I were evasive, either ignoring 
or minimizing the fact that the mother dog is 
nursing Blacky. In contrast, the second largest 
group wrote frank stories expressing an intense 
desire for food on Blacky’s part, describing 
Blacky as being “a little greedy” and as one 
who “loves to eat.” It would seem, then, that
these mothers were quite concerned about their 
nurturance needs. While they typically at- 
tempted to handle them by evasion, there 
were some cases in which the conflicts were so 
great or the defenses so weak that urgent oral 
concerns were quite openly revealed. 

6 The Overall Dimensional Score is based on the
subject’s spontaneous story produced in response to
each card, his responses to multiple-choice Inquiry
items, related comments in regard to that particular
dimension produced anywhere in the protocol, and on
ratings of cartoons as to preference.

Evasion was also characteristic of the
“strong” stories produced by the A mothers
to Card IV. This card depicts an angry Blacky
watching his mother and father making love. 
One A mother wrote: “Blacky has found 
Mama and Papa kind of spooning.... She 
likes these little episodes to occur as it makes 
her happy too.” Such data suggest that the A 
mothers are attempting to contain their con- 
flicts with primitive denial mechanisms. The 
second largest group of strong stories produced 
by the A mothers was quite candid about their 
jealousy, e.g., “Blacky is jealous because she 
is in love with Papa more than with Mama. 
She doesn’t like the attention that Papa is 
giving Mama. . . . She’s unhappy and jealous.” 

This qualitative analysis can be further pur- 
sued by considering those Inquiry items which 
approached or achieved statistical significance 
as in Table 3. Item IV, 3a and b, pointed up 
the greater involvement of the A mothers in 
Oedipal conflicts than the RH mothers. The A 
mothers seem to be more closely identified 
with their own mothers and, at the same time, 
tend to see their mothers as the disciplinary 
figure in the household more than is the case 
with the RH mothers (VII, 1a and c; VII, 3a 
and c). The A mothers also seem to be less 
concerned with masturbatory conflicts than 
the RH mothers (V, 2a), but they do not seem 
as confident about Blacky’s possibilities of 
growing up to be like her ego ideal as do the
RH mothers (X, 5a).

TABLE 3
INQUIRY CHOICES FOR TOTAL SAMPLE
(N = 75)

Card   Question and answer selected                                      Direction    χ²     Level of significance*
IV     3. Which one of the following makes Blacky most unhappy?
          a) Mama keeping Papa all to herself                            A > RH       3.06   .10-.05
                                                                         A > SOPD     3.00   .10-.05
          b) The idea that Mama and Papa seem to be ignoring her
             on purpose                                                  RH > A       5.18   .05-.02
                                                                         SOPD > A     5.18   .05-.02
V      2. How might Blacky feel about this situation when she is older?
          a) Happy without a care in the world                           A > RH       2.82   .10-.05
VII    1. Who talks like that to Blacky: Mama or Papa or Tippy?
          a) Mama                                                        A > RH       2.94   .10-.05
          c) Tippy                                                       RH > A       6.60   .02-.01
VII    3. Whom is Blacky imitating here? Mama or Papa or Tippy?
          a) Mama                                                        A > RH       3.36   .10-.05
          c) Tippy                                                       RH > A       4.78   .05-.02
VIII   3. Who does Blacky feel is paying more attention to Tippy?
          b) Papa                                                        A > SOPD     4.72   .05-.02
X      5. Actually, what are Blacky’s chances of growing up to be like
          the figure in her dream?
          a) Very good                                                   RH > A       4.13   .05-.02

* Significance levels represent two-tailed tests.


TABLE 4
PARI SCORES OF A MOTHERS, RH MOTHERS, AND
SOPD MOTHERS FOR NEGRO SAMPLE
(N = 45)

PARI scale     Analysis of variance     Analysis of covariance
5                    4.10*                     2.32
10                   3.38*                     1.84

* F ratio significant at .05-.01 level.


TABLE 5
TOTAL PATHOGENIC SCORES OF NEGRO MOTHERS
AS COMPARED WITH WHITE MOTHERS
(N = 75)

                          Analysis of variance     Analysis of covariance*
F ratio                         9.95                      1
Level of significance           .01                       ns

* Adjusted for acquiescence.

As compared with the SOPD mothers, the 
A mothers also appear more conflicted in the 
area of Oedipal relationships (IV, 3a and b). 
There is also evidence of greater sibling rivalry
on the part of the A mothers (VIII, 3b).
There seems to be an Oedipal flavor to these
feelings, inasmuch as the father is the parental
figure toward whom the rivalry for affection is
directed.

The PARI analysis allowed a test of Hy- 
pothesis 2. Analysis of covariance allowed for 
controlling the effect of acquiescence on the 
PARI scores. The mothers were first compared 
in terms of their total Pathogenic scores. When 
no significant differences were found, they were 
then compared by analysis of variance on each 
of the 23 PARI scales. A significant F ratio 
was obtained only on Scale 23 (Dependency of 
Mother scale). Since one significant F could 
have been obtained by chance in 23 tests, the 
hypothesis was not confirmed. 

The PARI data were also analyzed by racial 
groups. The total Pathogenic scores were again 
not found to be significantly different. The 
data for the racial groups were also analyzed 
separately for each of the 23 component 
scales. No significant differences emerged. 

Interestingly, these analyses would have 
been altered had there been no control for 
acquiescence response set. This can be seen in 
Table 4. Two significant F ratios disappeared 
when the means were adjusted for acqui- 
escence. A more striking example was found 
when the Negro and white samples were 
compared on total Pathogenic scores. In Table 
5 it can be seen that the Negro mothers had 
significantly higher Pathogenic scores when the 
analysis of variance results alone were considered.
However, when adjustments were
made for acquiescence, the differences in total 
Pathogenic scores vanished. 
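
How a covariance adjustment can remove an apparent group difference may be seen in a small worked example. The Python sketch below is an illustration only: the data are invented, the group labels are hypothetical, and the code computes covariance-adjusted means from the pooled within-group regression slope rather than the full analysis of covariance with its F test.

```python
def mean(values):
    return sum(values) / len(values)


def adjusted_means(groups):
    """Covariance-adjusted group means.

    `groups` maps a label to (covariate values, outcome values). Each group
    mean on the outcome is adjusted to the grand mean of the covariate,
    using the pooled within-group regression slope.
    """
    grand_x = mean([x for xs, _ in groups.values() for x in xs])
    sxy = sxx = 0.0
    for xs, ys in groups.values():
        mx, my = mean(xs), mean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx  # pooled within-group slope of outcome on covariate
    return {label: mean(ys) - slope * (mean(xs) - grand_x)
            for label, (xs, ys) in groups.items()}


# Invented data: (acquiescence scores, Pathogenic scores) for two groups.
data = {
    "more acquiescent": ([20, 22, 24, 26], [61, 63, 69, 71]),
    "less acquiescent": ([10, 12, 14, 16], [39, 45, 47, 53]),
}

raw = {label: mean(ys) for label, (_, ys) in data.items()}
adjusted = adjusted_means(data)

print(raw)
print(adjusted)
```

In these invented data the unadjusted means differ by 20 points, yet the adjusted means coincide, because the whole difference on the outcome is carried by the difference in acquiescence.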


DISCUSSION 


An overall evaluation of the Blacky and 
PARI results indicates some support for the 
psychoanalytic formulations of the psycho- 
dynamics of the mother-child relationship in 
asthma. In this population, mothers of asthmatic
children appeared to be more emotionally
disturbed than mothers of nonasthmatic
children. However, the reported differences 
were very slight in relation to the claims of 
some clinicians regarding the etiological in- 
volvement of psychological factors. In view of 
the small number of subjects in this study and 
the slight differences in test performance of the 
experimental and control groups, considerable 
caution must be exercised in interpreting the 
findings. Replication of this study is essential 
before one may conclude that the psychoso- 
matic hypothesis has been genuinely confirmed. 

Those who state that much of the psycho- 
logical distress of the mother of the asthmatic 
child is reactive to the child’s illness can draw 
some comfort from these data. The mothers of 
the chronically ill children appear to be more 
psychologically disturbed than those of healthy 
children. However, this reactive factor does 
not account for all of the variance; the A 
mothers appeared to be more emotionally dis- 
turbed than the RH mothers. Nevertheless, 
this reactive factor has been too often ignored 
in the psychosomatic literature. It may account 
for a major portion of the pathology which 
clinicians have observed in the mothers of 
psychosomatic patients. 

The content of the mothers’ psychosexual 
conflicts is also of interest. The fact that the 
A mothers had higher scores on the Oral 
Eroticism dimension is consistent with the 
psychoanalytic speculations. Mothers of asth- 
matic children have been described as being 
overly dependent and having inordinate
nurturance needs by Sperling (1949), Jessner
et al. (1955), and Coolidge (1956). The possi- 
bility that they are particularly disturbed in 
the area of Oedipal relationships, as is hinted 
in the present study, while implicit in all of 
these writings, has not been made a point of 
special attention. The Inquiry data point to
complex ramifications of the A mothers’ possi- 
ble Oedipal conflict. For example, they ex- 
pressed their sibling rivalry in terms of compe- 
tition for the father’s rather than the mother’s 
attention. They also were more closely identi- 
fied with their mothers and did not expect to 
realize their ego ideals. These latter two trends 
in the Inquiry data are consistent with height- 
ened Oedipal conflicts. Identification with the 
mother is assumed to be an integral step in a 
girl’s attempt to resolve her Oedipal problem; 
an exaggerated attempt at identification can 
follow from an exacerbated Oedipal conflict. 
If a woman renounces her ability to achieve 
her ego ideals (e.g., to become an attractive 
woman, desired by men), she cannot be 
“accused” of being a serious rival of her 
mother’s in bidding for her father’s affections. 
In this fashion, anxiety over Oedipal wishes is 
allayed.

Finally, in view of the large claims of clini- 
cians regarding the “asthmatogenic mother,” 
it may be useful to consider some different in- 
terpretations of the very modest trends sup- 
porting the psychosomatic position. First, it is 
possible that the present findings are limited 
to a lower income group. Perhaps larger dif- 
ferences in favor of the psychosomatic hy- 
pothesis would be found in higher socioeco- 
nomic groups. This is especially likely since 
psychoanalytic formulations have been based 
primarily on work with such patients. Another 
possibility lies in the fact that the mother may 
not be as potent an etiological factor in asthma 
as is often believed. Psychological causation 
may be a more general phenomenon. In one 
case, it may be the father who plays the pri- 
mary role; in another, it may be the siblings. 
In still a third, it may be the interaction of the 
entire family. Consequently, perhaps small 
differences are all that can be expected when 
attempts are made to isolate single etiological 
factors which are presumed to be responsible 
for large, complex phenomena. It is also possi- 
ble that dynamic involvement is primary in 
only a few cases and the reactive factor in 
many. Future studies could profitably compare 
severely asthmatic with mildly asthmatic 
groups to determine whether psychological 
processes vary, as some allergists believe, with 
severity of illness. 

Some of the most important implications of
the present study are methodological. The 
PARI analysis indicated that an acquiescence 
response bias drastically alters findings with 
this instrument. The Negro group, for example, 
was far more acquiescent in this study than 
the white sample. Therefore, their Pathogenic 
scores were spuriously higher than the whites’ 
indices. There are pitfalls in using attitude 
scales with multiple-choice responses along an 
agree-disagree continuum for purposes like the 
present ones. 

It has also been demonstrated that a chroni- 
cally ill control group is very necessary in this 
type of psychosomatic research. Without such 
a group, the reactive factor cannot be properly 
assessed. A large majority of psychosomatic 
studies do not provide for such a control group, 
and differences reported in such studies are un- 
warrantedly attributed wholly to dynamic 
processes. 

The fact that the Blacky test was able in 
some degree to differentiate the clinical groups 
while the PARI was not is also worthy of note. 
Perhaps projective tests are more appropriate 
for research involving psychodynamic phe- 
nomena of the kind considered here. As was 
previously indicated, the same underlying con- 
flicts may accompany totally different person- 
ality types. The conscious attitudes of indi- 
viduals may differ although the underlying 
pathology is identical. The performance of such 
varying types of personalities on attitude 
questionnaires would therefore not be in a 
given direction, and as a group, such subjects 
may appear similar to controls. 

Finally, this study represents some advance 
in the validation of the Blacky Pictures. Since 
the Blacky test was able to differentiate the 
asthmatic mothers from control groups, more 
or less along lines predicted by clinicians, this 
can be interpreted as another confirmation of 
its discriminative power. While the evidence is 
by no means entirely persuasive, at this stage 
of development in projective test validation, 
the Blacky appears to represent one of the
better methods. 

SUMMARY

The mother-child relationship has long been
implicated in psychosomatic theory as an etiological
factor in the genesis of bronchial asthma.
An attempt was made to examine this notion
by studying the psychosexual conflicts and the
attitudes towards child rearing and family life
of 25 mothers of children with bronchial
asthma. They were compared with 25 mothers 
of chronically ill children and 25 mothers of 
healthy children. The test battery consisted of 
the Blacky Pictures and the PARI. 

The mothers of asthmatic children appeared 
more emotionally disturbed than the control 
mothers on the Blacky, manifesting greater 
disturbance in the areas of Oral Eroticism and 
Oedipal Intensity. 

The reactive role that the child’s chronic illness
plays in aggravating the psychological
conflicts of the mothers is suggested by the fact 
that in both Cards I and IV, while A mothers 
gave evidence of being the most psychologi- 
cally disturbed, the healthy control group ap- 
peared least disturbed. The chronically ill con- 
trol group fell between the two. 

There were no significant differences in the 
PARI performance of the experimental and 
control groups. 

An acquiescence response bias was shown to
influence the scores obtained on the PARI.
This bias seems to limit the use of this instrument.

In general, this study gives a very small
measure of support to the psychosomatic hy- 
pothesis that the mother-child relationship 
plays a primary etiological role in the genera- 
tion of asthmatic symptoms. 

OBJECTIVE COMPARED WITH SUBJECTIVE MEASURES OF THE
SAME BEHAVIOR IN GROUPS¹


ERNEST B. GURMAN
Mississippi Southern College

AND BERNARD M. BASS²
Louisiana State University


PREVIOUS studies in the research program
from which the present paper originates
have shown that the objective measures
of social interaction described by Bass, Gaier,
Flint, and Farese (1957) are reliable (Bass,
1959b; MacDonald, 1960). In addition, the
objective measures of stability and coalescence
of opinion from before to after discussions
have been shown to covary meaningfully with
each other (Bass, 1955b). Further, Flint and
Bass (1957) have demonstrated the construct
validity of the objective measures of influence
or leadership by noting that they are associated
with ability, esteem, and subsequent merit.
The present study also explores properties
of these objective measures of social interaction,
which are based on the rank-difference
correlations of judgments of members of a
group before and after a discussion about the
judgments. It examines the agreement between
the objective measures of interaction and
observers’ and participants’ ratings about what
occurred during and after a group’s interaction.
If extremely high correlations were to be
found between the objective and subjective
assessments, either assessment could serve in
the place of the other. If no correlation were
found, it would put into question the meaningfulness
of what small group observers report
and/or of the objective measures. If, as
expected, moderate correlations should emerge,
the objective and subjective assessments of
such social concepts as successful leadership
would appear to share some common variance,
although reliance on observers and participants
alone would be inadequate to describe the
behavior completely, if behavior, per se, were
the focus of study. Similarly, if observers’ or
participants’ perceptions should be the focus
of a study, objective assessment alone would
be inadequate.

1 This work was aided by funds from the Louisiana
State University Council on Research and Contract N7
ONR 350609. Parts of the report were included in a
thesis of the senior author partially fulfilling the
requirements of a master’s degree at Louisiana State
University.

2 Now at the University of California, Berkeley,
1961-62.


METHOD 


Subjects 


The subjects consisted of 42 male and 18 female 
undergraduates from an introductory psychology 
course. The 60 students were all volunteers and were 
randomly assigned to 12 groups of five members each. 

Each subject was to be given an innocuous report 
after the completion of the experiment. With only this 
offer as an incentive, the groups were expected to have 
relatively low motivation to perform, a desirable 
condition in view of the higher reliabilities found for 
low motivation groups by Bass (1959b). 


Procedures 

Five case histories including a practice problem 
were discussed by each group. Five possible solutions to 
each problem were provided. 

Figure 1 shows such a case. The five alternatives are 
not necessarily the five best solutions but rather five 
from a much larger list which were originally judged by 
a large sample of subjects to be fairly close in quality 
but still reliably different from each other in value.
Closeness in quality serves to produce a discussion 
problem of considerable difficulty and conflicting 
opinion, yielding maximum opportunities for inter- 
personal influence and change. As expected theoretically 
(Bass, 1960b, p. 136) and confirmed empirically (Bass 
& Flint, 1958), there is a high correlation between how 
ambiguous or difficult the problem confronting a group 
is, and how much influence occurs in the group. 

The alternatives are shown here in order of worth
according to two graduate psychology classes of 25 students
each. Interclass agreement was .9.

The objective measures of social interaction were 
collected using the analog computer described by Bass, 
et al. (1957). The computer indicates directly the rank- 
difference correlation of the various sets of rank-orders 
of merit of the five solutions to a case history registered 
by each subject before and after discussion of that case 
history and its five solutions. 
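
The rank-difference correlation that the apparatus reads out can be sketched numerically. The following is a minimal illustration, assuming the usual Spearman formula for untied ranks, rho = 1 - 6(sum of squared rank differences) / (n(n² - 1)); the function name and the two example rankings are hypothetical and are not taken from the study.

```python
def rank_difference_correlation(ranks_a, ranks_b):
    """Spearman rank-difference correlation for two untied rankings.

    Both arguments must be permutations of 1..n, which mirrors the
    apparatus described above: tied ranks are not accepted.
    """
    n = len(ranks_a)
    if n < 2 or len(ranks_b) != n:
        raise ValueError("rankings must have the same length (at least 2)")
    expected = list(range(1, n + 1))
    if sorted(ranks_a) != expected or sorted(ranks_b) != expected:
        raise ValueError("each ranking must use the ranks 1..n exactly once")
    # Sum of squared differences between paired ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical rankings of the five solutions before (X) and after (Y) discussion.
initial = [1, 2, 3, 4, 5]
final = [2, 1, 3, 5, 4]
print(round(rank_difference_correlation(initial, final), 2))  # 0.8
```

With only five alternatives the coefficient can take a limited set of values, which is one reason single-problem scores are coarse and are later combined across problems.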

Each of the five members has his own panel, housing
two columns of five-position selector switches. In the
left column (X), the member sets his initial desired
rank for each alternative, using the topmost switch
for the first alternative, the second switch for the
second, and so forth. If a subject ties ranks by mistake,
a light on the experimenter’s panel automatically
signals the experimenter, who notifies the subject that
he must correct the error. In the right column of
switches, the member sets his final (Y) rankings, after
discussion. The experimenter’s panel contains banks of
switches for recording the group decision (G) and
criterion rankings (R) when these are available.

A sliding door cover on each subject’s panel prevents
him from observing the X and Y columns simultaneously.

The experimenter can read on a calibrated ammeter
any rank-difference correlation between any of the 12
sets of ranks (X1, X2, ..., G, R) by closing the appropriate
ammeter switch on his own panel. He can
privately communicate these readings back to members’
own ammeters as desired, although such was not done
in the current study.


Case History A
The Case History of Paul


Paul, a sophomore at a state university, knew that
a certain group of boys had bribed a person in the
mimeograph office and obtained an important exam.
He knew that if he exposed the bribe, the exam would
be changed, but the people involved, many of whom
he knew quite well, would be caught and, since the
university enforced the rules against cheating very
strictly, would probably be suspended from the school,
or at least be given an F in the course. Such an action
would obviously make Paul extremely unpopular with
the students during the rest of his stay. He could afford
to go nowhere else.

Paul was an average student, but a series of personal
problems last semester had affected his studies and
caused him to be put on probation. He had to pass this
very rough course to stay in college, and he was just on
the borderline between passing and failing.

The gang with the stolen exam offered to cut Paul
in on it, since they knew Paul had seen a copy in the
hands of one of the fellows in the dorm; but Paul had
strong moral feelings against cheating and had turned
down the offer. But since the course was graded on the
curve, he felt the added advantage of the others would
be sufficient in such a small class to cause him to fail.
What should he do?


A. Try to convince the other fellows not to use the
exam to study by.

B. Paul could inform officials that the exam had been
passed among the students; he could do this in a
letter, hence would not involve himself.

C. Consult with teacher.

D. Keep mum and take test as is.

E. Seek aid to problem from minister.


FIG. 1. Human relations problems:
Solutions provided.

The participants were given 3 minutes on the
practice problem and 10 minutes for each case history
(the cases A, B, C, and D are reported in Bass, 1960a).
Order of presentation of the four cases was randomized.
The time allowed was considered sufficient since most
groups reported final decisions before reaching the
time limit.

Each group member first read the discussion problem 
and privately ranked the solutions on his panel before 
beginning the discussion (X). 

The group then was asked to enter into a discussion
and arrive at a decision as to the best rank ordering.
After the group decision (G) was reported to the experimenter,
each subject, without referring to his first
rankings, once again ranked the solutions in the order
he preferred (Y).

Two observers rated the groups after each case 
history on each of the same variables purportedly 
assessed objectively. In addition, each participant 
made one rating on each variable of all members 
(including himself) at the termination of each discussion 
session. 


Objective Measures 


Stability. The rank-difference correlation between 
each subject’s rank judgments before (X) and after (Y) 
a discussion provided an objective measure of that 
subject’s stability of opinion for that discussion. This 
rho correlation is symbolized hereafter as XY. 

Initial agreement with group decision. This was the 
rank-difference correlation between a subject's initial 
judgment (X) and the group decision (G), symbolized 
as GX. 

Acceptance of the group decision. This was the rank- 
difference correlation between a subject’s final opinion 
(Y) and the group decision (G), symbolized as GY. 

Initial agreement with others. This was each subject’s 
mean correlation initially with all other members of his 
group, symbolized as XX. 

Final agreement. This was each subject's mean 
correlation after discussion with all others in his group, 
symbolized as YY. 

Attempted leadership. This was the total amount of
time spent talking by a subject during a discussion 
(T). (For a discussion of the validity of this objective 
measure, see Bass, 1959a.) 

Public success as a leader (L). Public successful 
leadership of a subject was the extent the group decision 
correlated with that subject’s initial ranking rather 
than with other members’ judgments and the extent 
the subject agreed more with the group decision relative 
to his initial agreement with others. Algebraically, the 
public successful leadership of member i was equal to 
the amount the group decision agreed with member i’s
initial decision (GX), less the amount the group decision 
did not agree with the other members’ initial rankings 
(GX), plus the amount member i’s final decision agreed 
with group decision (GY), less the average amount he 
agreed initially with the other members (XX). (Subsequent
research has indicated that L is not as valid a
measure of successful leadership as A, relative success 
as a leader, Bass, 1959a.) 
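
The five correlational measures defined above (XY, GX, GY, XX, YY) can be sketched for one group as follows. This is an illustrative reading of the verbal definitions, not the authors' computation: the helper names and the three-member example are hypothetical, XX and YY are taken as plain arithmetic means of the pairwise correlations (the text says only "mean correlation"), and the L index is left out because its verbal definition is not fully recoverable here.

```python
def rho(a, b):
    """Rank-difference correlation for two untied rankings of equal length."""
    n = len(a)
    d_squared = sum((x - y) ** 2 for x, y in zip(a, b))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def group_measures(initial, final, group_decision):
    """Return XY, GX, GY, XX, YY for every member of one group.

    initial, final: dicts mapping member name -> ranking (X and Y).
    group_decision: the group's agreed ranking (G).
    """
    measures = {}
    for member in initial:
        others = [o for o in initial if o != member]
        measures[member] = {
            "XY": rho(initial[member], final[member]),      # stability
            "GX": rho(group_decision, initial[member]),     # initial agreement with G
            "GY": rho(group_decision, final[member]),       # acceptance of G
            "XX": mean(rho(initial[member], initial[o]) for o in others),
            "YY": mean(rho(final[member], final[o]) for o in others),
        }
    return measures


# Hypothetical three-member group (the study used five members per group).
X = {"a": [1, 2, 3, 4, 5], "b": [5, 4, 3, 2, 1], "c": [1, 2, 3, 4, 5]}
Y = {"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 5], "c": [1, 2, 3, 4, 5]}
G = [1, 2, 3, 4, 5]
for member, scores in group_measures(X, Y, G).items():
    print(member, scores)
```

In this toy case member "b" reverses his ranking completely, so his XY and GX are -1 while his GY and YY are 1, the pattern one would expect of a member who was fully influenced by the group.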


Subjective Measures 


After each discussion, observers and participants 
described each discussant’s behavior on seven variables. 
The two observers operated without knowledge of the 
actual meter readings collected by the experimenter. 
They could only judge each member’s behavior on the 
basis of what they heard him say or saw him do during 
the discussion. The participants, likewise, had no access 
to meter readings but had to make their assessments
of each other’s discussion behavior on the basis of what
they saw or heard and on what they, themselves, did.
(One possible additional complex cue was provided
by the clicking of the switches. For example, one might
infer rightly, or sometimes wrongly, that much clicking
meant much changing of opinion.) To indicate how
much a particular behavior was exhibited by each dis- 
cussant, raters used a five-point scale as follows: 40, A 
great deal; 30, Fairly much; 20, To some degree; 10, 
Comparatively little; 0, Not at all. 

The correspondence, by definition, between the observer
and participant-rated behavior and objective
measures is as follows:


OBJECTIVE MEASURE            SYMBOL   BEHAVIOR RATED BY OBSERVERS
                                      AND BY PARTICIPANTS

Stability                    (XY)     He maintained his initial
                                      decision.

Initial agreement with       (GX)     The group decision agreed
group decision                        with his initial private
                                      decision.

Acceptance of the            (GY)     His final private decision
group decision                        agreed with the group
                                      decision.

Initial agreement            (XX)     He agreed privately with
with others                           the rest of the group
                                      initially.

Final agreement              (YY)     He agreed privately with
                                      the rest of the group
                                      finally.

Attempted leadership         (T)      He attempted to influence
                                      others.

Public success as            (L)      He successfully influenced
a leader                              others publicly.

RESULTS 

Reliability of Observers 


The two observers’ ratings of the 60 subjects 
were correlated for each variable on each 
problem. The resulting 28 correlations were 
corrected by the Spearman-Brown formula to 
provide an estimate of the reliability of a 
composite observation based on the two obser- 
vers. The four reliability estimates for a 
measure were averaged via Fisher’s z trans- 
formation, and this average was then corrected 
with the Spearman-Brown formula to obtain a 
final estimated reliability of a score based on 
two independent observations of a total of 
four problems. Table 1 shows these estimated 
reliabilities of the observations pooled for two 
observers across four problems. 
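
The pooling procedure just described can be sketched as below. This is a hedged illustration that assumes the standard Spearman-Brown prophecy formula, k·r / (1 + (k − 1)·r), and the usual Fisher z average (arctanh, mean, tanh); the function names are hypothetical. It also assumes that the per-problem entries of Table 1 are already the two-observer (Spearman-Brown corrected) reliabilities, which the source implies but does not state outright.

```python
import math


def spearman_brown(r, k):
    """Reliability of a composite k times as long as the unit with reliability r."""
    return k * r / (1 + (k - 1) * r)


def fisher_z_mean(correlations):
    """Average correlations through Fisher's z transformation.

    Note: math.atanh is undefined for r = 1 or r = -1.
    """
    zs = [math.atanh(r) for r in correlations]
    return math.tanh(sum(zs) / len(zs))


def pooled_reliability(per_problem):
    """Average the per-problem reliabilities, then step up to all problems combined."""
    return spearman_brown(fisher_z_mean(per_problem), len(per_problem))


# Step 1: a raw correlation between the two observers (hypothetical value)
# is stepped up to the reliability of the two-observer composite.
print(round(spearman_brown(0.30, 2), 2))

# Steps 2 and 3: the four per-problem values reported for XX in Table 1.
print(round(pooled_reliability([0.47, 0.26, 0.44, 0.00]), 2))  # 0.63
```

Under these assumptions the XX row yields about .63, in line with the combined estimate reported for that measure.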

The estimated reliabilities ranged from .63
to .94 for the seven variables. Observations
were more reliable when two observers rated
overt rather than covert forms of behavior.
The highest reliability of .94 occurred when
observers rated T. Next most reliable were
observers’ ratings of L. The most unreliable
observer ratings were of XX before any
discussion began. This is understandable since
the observers could only estimate the amount
of XX from the direction of action and possible
opinions expressed during the progress of the
discussion.


TABLE 1


CORRECTED SPLIT-HALF RELIABILITIES OF
OBSERVER RATED MEASURES OF
INDIVIDUAL PERFORMANCES


                                                      Problem           Estimated reliability
Behavior rated by observer or participants         A    B    C    D     of four problems combined

He maintained his initial decision. (XY)          .30  .58  .36  .62            .79
The group decision agreed with his initial        .23  .36  .60  .44            .73
  private decision. (GX)
His final private decision agreed with the        .62  .56  .64  .45            .84
  group decision. (GY)
He agreed privately with the rest of the          .47  .26  .44  .00            .63
  group initially. (XX)
He agreed privately with the rest of the          .42  .56  .42  .06            .71
  group finally. (YY)
He attempted to influence others. (T)             .82  .81  .73  .79            .94
He successfully influenced others                 .61  .70  .67  [illegible]    [illegible]
  publicly. (L)


Reliability of Objective Measures 

Objective performance on a particular
variable such as XY on each problem was
intercorrelated with similar performance on
every other problem. The Spearman-Brown
formula was used by MacDonald (1960) to
estimate from this average intercorrelation the
reliability of a test four times the length of a
single problem.

Each variable was treated in the same way.
The results were generally lower than had been
attained for 10-12 shorter problems about the
familiarity of words or the size of cities requiring
the same overall testing time (Bass, 1959b).
The estimated reliabilities for combined
scores based on the four human relations
problems are as follows: XY = .31, GX = .20,
GY = .69, XX = .45, YY = .31, T = .93,
L = .40.

These reliabilities were reduced considerably 
by one inconsistent case (Case D). They may 
have been lowered further by differences in 
problem content and difficulty. Although 
extrinsic reward was equal for all subjects, 
intrinsic motivation associated with content 
and the difficulty of the problem may have 
varied. Absence of feedback may also have had 
an adverse effect on consistency. In previous
reliability studies, feedback was given as to
the correctness of the solutions. Since many 
of the solutions were arbitrary, the lack of 
feedback in the present study may have 
reduced the consistency of performance. 


Correlation of Observer Ratings and Objective 
Measurements 


As Bass (1960b, Ch. 2) suggested, the same 
construct may be defined by a variety of 
operations. The empirical associations among
these operations indicate the validity of the
construct that they define. A correlation of
definitional validity indicates the amount of 
association between two different measure- 
ments, both of which are related to the same 
construct by different operations. Thus, the 
definitional validity correlation coefficient of 
.70 between the observers’ rating of how much 
participants talked and the clock reading of 
time the discussants talked provides evidence 
that both measurements can be regarded as 
“sense impressions” tied to the same construct. 
As McClelland (1951) has noted “. . . A con- 
cept is useful theoretically (1) to the extent 
that the person using it makes explicit its 
operational meanings and (2) to the extent 
that those operations cut across and are con- 
firmed in different areas of behavior” (p. 158). 

All observer ratings and all objective meas- 
ures were intercorrelated (by the product- 
moment method) to provide measures of 
definitional validity of the seven variables of 
social interaction. These definitional corre- 
lations are presented in Column 1 of Table 2. 
All of the correlations of definitional validity 
except that of XY and XX were significant at 
the .01 or .05 level. 

The more public the variable, the higher was 
the correlation between objective and sub- 
jective measures. Thus, T, the most public 
measure, exhibited a definitional validity of
.70. Conversely, those measures with least
definitional validity were mainly about private 
events, such as XX and XY, which produced 
the lowest correlations, .21 and —.09, respec- 
tively. The correlation of .21 of XX is under- 
standable in view of the fact that the observer 
had to estimate prediscussion agreement from 
the discussion that followed. The very low 
correlation of -.09 for XY, the only negative
correlation, indicates that the observers were
completely unable to sense who actually shifted
his opinions. But observers did seem to be more
accurate in judging final private agreement.


TABLE 2


CORRELATIONS OF OBSERVER, PARTICIPANT, AND
SELF-RATINGS WITH OBJECTIVE MEASURES


              (1)                   (2)             (3)
            Observer            Participant         Self
Measure   correlations          correlations    correlations

XY        -.09   (-.018)ª           .20             .16
GX         .38** ( .99)             .12             .18
GY         .34** ( .59)         [illegible]         .20
XX         .21   ( .74)             .08              —
YY         .49** ( .93)             .10             .24
T          .70** ([illegible])  [illegible]         .45**
L          .31*  ( .53)         [illegible]     [illegible]


ª Coefficients corrected for attenuation by the unreliability of
the correlated measurements.

* p < .05 when r = .25.

** p < .01 when r = .33.

Estimates of the true definitional validities 
of the measures, assuming perfectly reliable 
measuring instruments and observers, are 
shown in parentheses in Column 1 of Table 2. 
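
The source does not print the formula behind these parenthesized estimates. A minimal sketch, assuming the classical correction for attenuation (the observed correlation divided by the square root of the product of the two reliabilities), is given below; the function name is hypothetical. The worked case uses the figures reported for GX: an observed correlation of .38, an observer reliability of .73, and an objective-measure reliability of .20.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Estimate the correlation between two measures if both were perfectly reliable.

    Assumes the classical formula r_xy / sqrt(r_xx * r_yy); this is an
    assumption about the method, not a formula quoted from the source.
    """
    if reliability_x <= 0 or reliability_y <= 0:
        raise ValueError("reliabilities must be positive")
    return r_observed / math.sqrt(reliability_x * reliability_y)


# GX: observed .38, observer reliability .73, objective reliability .20.
print(round(correct_for_attenuation(0.38, 0.73, 0.20), 2))  # 0.99
```

Under this assumption the GX figures give about .99, matching the parenthesized entry for GX. Because the correction divides by the reliabilities, a low reliability such as .20 inflates the estimate sharply, and corrected values can exceed 1.0 when the reliabilities are themselves underestimated.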

The definitional validity obtained may have 
been due in part to the fact that the two 
measures related by definition are positively 
correlated with a third measure not related by 
definition. For this reason, Table 3 was con- 
structed to examine the extent to which the 
correlations reflected some generalized observer 
rating not directly associated with the behavior 
the observers were supposed to be rating. This 
type of examination has been advocated by 
Campbell and Fiske (1959), who propose that 
in addition to being able to measure constructs 
by two different methods, “Measures of the 
same trait should correlate higher with each 
other than they do with measures of different 
traits involving separate methods.” 

Table 3 compares the definitional validity 
of a measure with the correlation of observers’ 
ratings with all other objective measures not 
related by definition. Only in one instance, 
XX, was the average correlation of observers’ 
ratings with all other irrelevant objective 
measures equal to the correlation with the 
relevant objective measure related by definition, 
and in no case did any objective measure 
correlate on the average with all other irrele- 
vant ratings higher than it did with the 
definitionally relevant rating. 





TABLE 3


AVERAGE CORRELATIONS OF ALL OBSERVER
RATINGS AND ALL OBJECTIVE
MEASUREMENTS
(Fisher’s z transformation used for averaging)


            Definitional validity:      Average correlation of      Average correlation of
            correlation between         objective measure with      observer ratings with
            observer ratings and        all other irrelevant        all other irrelevant
            relevant objective          (nondefinitional)           (nondefinitional)
Measure     measure (by definition)     observer ratings            objective measures

XY                  -.09                      .03                         .07
GX                   .38                      .21                         .19
GY                   .34                      .21                     [illegible]
XX                   .21                  [illegible]                     .21
YY                   .49                      .16                         .20
T                    .70                      .37                     [illegible]
L                    .31                      .16                         .29


Correlation of Self and Participant Ratings with 
Objective Measures 

Table 2 lists the correlations between 
participant ratings and objective measure- 
ments (Column 2) and between self-estimates 
and objective measurements (Column 3). 
It can be seen that generally the observers’ 
ratings had higher definitional validities than 
the validities of participant or self ratings. 

Like the observers, the participants were 
able to recognize the overt measures more 
accurately. Again, T yielded the highest 
correlation with its relevant objective assess- 
ment while XY and XX, both private events, 
produced the lowest correlations of definitional 
validity. 

Participants were about as close to objective 
measurements as were observers in gauging 
GY, T, and L. There were five participants 
whose ratings could be averaged compared to 
only two observers. This suggested that the 
ratings based on participant opinion were 
likely to be more reliable than those drawn 
from observers. Yet participants were less able 
to sense GX, YY, and T as objectively defined. 
Personal involvement seemed detrimental to 
perceiving who actually talked the most, who 
finally was most in agreement with others, and 
whose initial opinions were best matched by 
the group decision. 

Likewise, participants were less accurate in 
perceiving their own objective performance 
measures, GX, YY, and T, than were obser- 
vers. Moreover, each participant seemed less 
accurate in discerning how much he himself 
really accepted the group decision (GY). On 
the other hand, each participant was the most
accurate reporter of his own XX, in contrast
to the ratings of observers and participants.


DISCUSSION


A number of inferences may be drawn from
these results. First, the results for all the
measurements, except XY and XX, support
the argument that something meaningful
(differentially so in the sense of Campbell and
Fiske, 1959) is being assessed by the measures
GX, GY, YY, T, and L.

Second, observers and participants were
unable to sense objective private events,
particularly XY, the stability of the discussants
from before to after discussion.
Contributing to the inability of members to
sense private stability are coercive effects
within the group itself, producing public but
not private agreement. Coercion is brought
about by the desire of the discussants for the
discussion to move along smoothly and reach
a group decision in the time allowed. The
individual members are thus likely to indicate
their willingness to “go along” with the group
to avoid being a stumbling block in the path
to the group goal, while still holding to their
original opinions. Coercion also arises when in
some cases there is little chance among members
to come to agreement and the group
decision is reached by a majority vote. In this
case, the out-voted members sometimes take
the “I’ll show them” attitude and again rank
the solutions privately exactly as they ranked
them initially, even though other members
were able to see other points of view and reach
a common decision. Participants and observers
may therefore see more final agreement than
actually occurs.


Third, despite their fewer numbers,
observers tended to estimate objective performances
more accurately than did the
participants. Several reasons may be offered.
Participants only experience their own group
and their own personal ranking within the
group. They have no absolute knowledge
across groups such as is available to an observer.
It is more difficult to be an involved
participant-observer than only an observer.

Fourth, the lack of understanding of what is
actually happening in the group situation
points to the potential value of feeding back
to the members the true situation they are
participating in as it is taking place (see Bass,
1960a). The participants were unable to infer
successfully the extent of agreement among the
members after the close of the discussion
session.

Fifth, the data suggest that participant 
ratings in small group behavior are open to 
serious question. No simple conclusion about
the matter is possible, for, first of all, investigators
may be particularly interested in members’
perceptions, per se, not what actually
happened. Moreover, our data suggest that
observers, participants, and self-appraisers,
in the order named, are fairly good estimators
of who talked the most in the group. But, 
beyond this, investigators interested in what 
actually happens in a group would find it most 
profitable to rely on observers rather than the 
participants themselves for gauging public 
events, if objective assessments are not avail- 
able, particularly if the reliability of observer 
ratings can be made greater than what was 
attained in this study. For private events, 
investigators would do well to seek objective 
measurements. 

Sixth, if these data have generality, then 
certain specific practices bear re-examination. 
For example, a common criterion of group 
effectiveness is the self-rating by members of 
how much each accepts the group decision. 
Our data suggest that this particular self- 
rating does not even correlate significantly 
above zero at the .05 level with objective 
indices of how much each member agreed 
finally with the group decision. Observers and 
other participants even seem better at making 
this particular estimate. 

Why should participants, as a whole, and 
observers, have been more accurate in assessing 
the average individual’s actual acceptance of 
the group decision than the individual himself 
evaluating his own degree of acceptance? The 
answer may be a matter of statistics, but the 
consequences are there, nevertheless. The 
superior accuracy of the other participants 
might simply have been due to the greater 
reliability of the pooled judgment of four 
members about a fifth than the fifth member 
about himself. The superior accuracy of
observers may have been due to their ability
to transcend group differences. Typically, we
obtain consistent and intercorrelated differences
between groups in these problem situations
(Bass, 1955b). Observers had an opportunity
to evaluate both members and groups differentially,
augmenting the total correlation of
their observations with the performance of the
differing members in the differing groups.
Individuals could only assess themselves based 
on their experience in one group. Any correla- 
tion between groups could not inflate the 
individual raters’ overall accuracy. 

Regardless of the reason, the 
correlation of only .20 between members’ 
stated acceptance of the group decision and 
their actual acceptance suggests increased 
caution concerning the utility of subjective 
estimates as criteria of group effectiveness. This 
low correlation of .20 for the 12 groups of 60 
subjects is consistent with a product-moment 
correlation of only .13 found for 19 similar 
groups of 95 college subjects in the extent they 
actually accepted the group decision (GY) on 
10 trials and the extent they rated themselves 
attracted to their group (Bass, 1955a). If 
rated satisfaction with group decisions does not 
really reflect actual acceptance of the group 
decisions, what does such subjective satis- 
faction measure? A review of conference 
research studies suggests that rated satisfaction 
with group decisions is much more closely 
associated with a member’s satisfaction with 
his own work at the conference than with 
group performance per se. For example, 
satisfaction with the group outcomes was found 
to correlate much more highly with personal 
matters than with such group indices as the
number of agenda items completed (Peterman,
1951). Conferees were less satisfied with
decisions that took a longer time to make, yet 
“rapid decision-making may reflect apathy, 
not efficiency” (Bass, 1960b, p. 46). 

Seventh, the generality (or lack of it) of 
these data should be considered in the light of 
other possible subjective and objective assess- 
ments and modes of analysis. For example, the 
objective data were unrestricted in degrees of 
freedom by problem or group. In the same 
way, the observers were free to assign all 
members of one group higher ratings than all 
members of another group. Yet, psychologi- 
cally, participants were restricted in knowledge 
to what happened in their own group. Their 
anchor points were based on their own group 
only. Statistical results might have been some- 
what different if all subjective assessments had 
been within-group comparison judgments with 
constant means for each problem and each 
group. Or, given the original data of this study,
correlations between objective and subjective
data could have been run problem-by-problem
and group-by-group, and then averaged. This
would no doubt have favored participants over 
observers as accurate estimators of objective 
performance. However, these results, although 
of interest theoretically, would have lacked 
practical implications, for they would have 
been results to be generalized only to situations 
where all groups are equal in performance on 
every problem. An analysis apportioning the 
covariance between each objective and sub- 
jective measure would permit estimating the 
independent effects of problem, group, and 
individual and the covariance between objec- 
tive and subjective measurement. Resources 
permitting, this may be done. 


SUMMARY 


The purpose of this study was to investigate
the relationships between subjective and
objective assessments of group behavior.
Discussion behavior was measured objectively
by a social analog computer, while subjective
assessments were made by observers and
participants aided by a check list.

There tended to be agreement between the 
two different methods of measurement, not 
accounted for by mutual correlations with 
irrelevancies, suggesting that both objective 
and subjective measurements were concerned
with the same constructs.

Observers and participants had difficulty in 
inferring private events that took place during 
the discussion. The difficulty of rating private 
events suggested that observers’ ratings are 
most appropriate for overt events. Yet the 
successful study of covert events requires 
reliance on objective measurement. 
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THE EFFECT OF EFFORT ON 


THE ATTRACTIVENESS OF 


REWARDED AND UNREWARDED STIMULI 


ELLIOT ARONSON 


Harvard University 


ESTINGER’S (1957; Festinger & Aronson, 

1960) theory of cognitive dissonance 

states that if a person simultaneously 
holds two cognitions that are psychologically 
inconsistent with each other (dissonance), he 
attempts, in some way, to make them more 
consistent (dissonance reduction). One deriva- 
tion from this theory is that if a person (or a 
fox) expends effort in an attempt to reach a 
goal (e.g., a bunch of grapes) and fails, he 
experiences dissonance. His cognition that he 
has tried hard to reach the goal is dissonant 
with his cognition that he did not reach it. 
Probably the simplest way for him to reduce 
dissonance in such a situation is by leaving the 
situation and convincing himself that he did 
not really want the goal in the first place (e.g., 
the grapes were sour anyway). But suppose he 
cannot readily leave the situation, and thus 
continues to expend effort in order to reach the 
goal. Each try he makes at attaining the goal 
strengthens his cognition that he wants the 
goal; thus an attempt to convince himself that 
he did not really want the goal becomes quite 
unrealistic and, therefore, is not an adequate 
way to reduce dissonance. Under these circum- 
stances, he must reduce dissonance by some- 
how justifying the expenditure of effort in 
spite of his failure to achieve his avowed 
purpose. That is, he must find something else 
in the situation to which he can attach value— 
a serendipitous event.

Thus dissonance theory leads to the follow-
ing prediction: if a person continuously
expends effort to attain a goal and is unsuccess-
ful, stimuli associated with this experience
become more attractive to him. This prediction
appears to be contrary to inferences that might
be drawn from the results of several experiments
involving the concept of secondary reinforce-
ment (Bugelski, 1938; Cowles, 1937; Melching,
1954; Mitrano, 1939; Mote & Finger, 1942;
Saltzman, 1949; Skinner, 1938; Wolfe, 1936).
Generally, in these experiments, animals
learned to associate previously neutral stimuli
with the attainment of a reward. Subsequently,
these stimuli came to function as rewards in
and of themselves.

¹ This report is based upon a dissertation submitted
in partial fulfillment of the requirement for the degree
of Doctor of Philosophy at Stanford University in
1959, and this research was partially supported by a
grant from the National Science Foundation, ad-
ministered by Leon Festinger.

The author gratefully acknowledges his indebtedness
to the chairman of his committee, Leon Festinger, and
to Robert R. Sears and Douglas H. Lawrence who
served as members of his committee.

These studies do not make clear exactly 
what processes are involved in secondary 
reinforcement. Strictly speaking, all that can 
be said is that a previously neutral stimulus 
temporarily serves the function of a reinforcer. 
It seems reasonable to infer, however, that 
through association with reward, the secondary 
reinforcer actually undergoes an increase in 
attractiveness. If this inference is correct, we 
have two theories that yield somewhat different 
predictions: reinforcement theory suggests that 
stimuli associated with reward gain in attrac- 
tiveness; dissonance theory suggests that 
stimuli associated with “no reward” gain in
attractiveness. These predictions, although
not mutually exclusive, can be discriminated. 

The crucial discriminating variable seems to 
be the expenditure of effort. The prediction 
from dissonance theory holds only if a person 
has expended effort in an attempt to attain the 
reward; if he has not expended effort there is 
no cognitive inconsistency, hence nothing to 
justify, hence no gain in the attractiveness of 
associated stimuli. Reinforcement theory sug- 
gests no such distinction—the implication 
being that stimuli associated with the attain- 
ment of a reward become more attractive to a 
person regardless of the effort involved. The 
following experiment was designed to investi- 
gate the inferences from these theories by 
systematically manipulating the attainment 
of reward and the expenditure of effort under 
controlled laboratory conditions. Our hypoth- 
esis is that, as effort increases, stimuli associ- 
ated with lack of reward become more attrac-
tive. To be more specific, since any effects
due to secondary reinforcement should remain 
constant regardless of effort, stimuli associ- 
ated with lack of reward should become more 
attractive (relative to stimuli associated with 
reward) as effort increases. 


METHOD 


General Design of the Experiment 


To test the hypothesis, the following conditions 
were necessary: To have some subjects perform a 
difficult, effortful task, while others performed an easy, 
effortless task, in order to receive a reward; to have the 
reward occur only occasionally; and to provide a 
distinctive difference between the goal stimuli during 
the rewarded and nonrewarded trials. 

Subjects could then be asked to rate the relative 
attractiveness of the two goal stimuli prior to the 
experiment and again after having undergone the 
experiment. In this way it would be possible to ascertain 
the relative effects of reward and nonreward on attrac- 
tiveness as a function of the amount of effort expended 
during the performance of a task. 


Description of the Task 

The task involved “fishing” for containers (which 
were actually pocket-sized flashlights) in order to 
obtain money that was inside some of them. The bulbs 
and batteries were removed from the containers and 
small metal rings were inserted in the bulb sockets. 
The containers were identical in all respects except for 
color; some were red and some were green. Those of 
one color contained either two, three, or four dimes 
wrapped in tissue paper; those of the other color con- 
tained only tissue paper. The containers were arranged 
randomly along one wall of the experimental room, 
covered by a narrow strip of cardboard in such a 
manner that only the metal ring was visible, not the 
color. Thus the subjects could not see the color of a 
container until after they had removed it from under
the cardboard covering. 

In the Easy condition, subjects were instructed to 
pull the containers out from under the cardboard 
covering by attracting the metal ring with a horseshoe 
magnet. This was an extremely easy task; subjects took 
an average of 14 seconds to pull out and open each 
container. In the Effortful condition, each subject 
was given a hook that was tied to one end of a string. 
The subjects were instructed to hold the string at a 
specified distance from the hook and “catch” the 
container by inserting the hook into the metal ring. 
Subjects in the Effortful condition took an average of 
52 seconds to pull out and open each container. 

For obvious reasons, it was necessary to disguise the 
purpose of the experiment. The investigator therefore 
introduced the experiment as an attempt to ascertain 
the effects of reward on fatigue. Subjects were told 
that in this experiment people were asked to perform 
one of several tasks that varied in their degree of 
tediousness. The experimenter explained that 50% 
of the subjects received a reward periodically during 
the performance of their task, while the other 50%
received nothing. “The purpose of the experiment is
simply to determine whether getting a reward for
doing a more or less tedious task will counteract the
effects of fatigue, and hence, result in more efficient
performance on the part of the people who receive a
reward.” All subjects, regardless of condition, were
told that they were among the people who would be
rewarded for their performance. They were told that
several of the containers held money, and that they
could keep all of the money they could find in them.

In order to justify the fact that only containers of
one color would be rewarded, the experimenter informed 
the subjects that they would be rewarded according 
to a schedule of “differential reinforcement.” 

“That is,” the experimenter said, “there are two
colors; red and green. In 85% of the containers of
one color, some money has been placed; 15% are
empty. In those of the other color, just the reverse
is true; there is money in only 15% of them, while
85% are empty. The reason for this is that we want
to see how long it takes you to discover which color
is more likely to contain a reward. We'll ask you
about this after the experiment. Even after you are
convinced that one color is more apt to contain a
reward than the other, it is still to your advantage
to examine the contents of the other, since 15% of
them will contain money.”

Actually, of course, all of the containers of one color 
contained money while none of those of the other color 
did. Subjects were given false ratios so that they would 
open all containers and thus be exposed to each for an 
approximately equal period of time. 

The subjects were instructed to open each container 
as soon as they got it, to empty the contents (if any),
and then to place the empty container through a slit 
into a box before going on to the next one. The box 
was used in order to limit the subject’s exposure to 
each color. 


Obtaining Measures of Color Preference 


In order to avoid contamination of the results, it 
was important to make it difficult for subjects to see 
the relationship between the fact that they were rating 
the color of the containers for attractiveness and the 
fact that some of the containers held money. This was 
accomplished in the following way. The experimenter 
stated that this was not his own piece of research. He 
informed the subject that he was being employed to 
run subjects for a psychologist at a neighboring univer- 
sity. To fortify this statement, the experimenter took 
great pains to appear nonchalant and almost bored in 
presenting the instructions. Somewhat later in the 
session, after the experimenter had finished describing 
the nature of the task and the schedule of reinforcement, 
he suddenly changed his demeanor and enthusiastically 
began to explain “an extremely interesting and exciting 
side-finding.”

While conducting the experiment, I happened to
notice that it doesn’t take a person precisely the
same amount of time to pull out each container;
there is quite a bit of fluctuation. I wanted to see
whether this fluctuation was just a matter of chance
or whether it might be lawfully related to something
else in the experiment which I could measure. What
I uncovered was a very intriguing relationship. To
a large extent, fluctuations in a person’s performance
time are a function of his color preference. It looks
as though the experience of seeing one’s preferred
color builds up a person’s motivation, or gives him a
feeling of well being, or inspires him; I don’t know
exactly what. But whatever it does, it leads to more
efficient performance on the very next try. What I
would like you to do, then, is to examine these two
containers carefully and tell me which of the two
colors you think is prettier.

After the subjects made their selection, the experi-
menter showed them a scale and asked them to rate the
relative magnitude of their preference from 0 (virtually 
equal in prettiness) to 5 (a very large difference in 
prettiness). 

It was necessary to obtain from each subject two 
ratings of the relative attractiveness of the colors—one 
immediately before the performance of the task and 
one immediately afterward. An important problem 
was justifying the necessity of the post-experimental 
rating without arousing the subjects’ suspicions con- 
cerning the true nature of the experiment. This was 
accomplished in the following manner. The experi- 
menter reminded the subject about the aspect of the 
experiment that he, personally, was interested in—the 
effect of color preference on fluctuations in performance
time. He explained, 

This is an extremely difficult thing to assess for the
following reason. When people are first shown these
containers, they are asked to rate which one they
think is prettier. Chances are, they haven’t had a
great deal of contact with these two specific colors
prior to the experiment. Hence their rating of color
preference is probably not a very stable one, since it
is not based upon much experience. Then, after
they’ve rated their preference, they spend the next
several minutes coming into repeated contact with
these colors—picking up the containers, handling 
them, opening them, closing them, etc. Due to this 
concentration of experience with the colors, most of 
the subjects undergo a shift in their color preference 
in the course of the experiment. This is a rough 
problem for me; since I’m trying to determine the 
effect of color preference on performance at every 
point in the experiment, it would be nice if I could 
assume that a person’s color preference remains 
constant at every point in the experiment. But I 
can’t assume this; the subject’s color preference is 
continually changing. I need some way of obtaining 
an estimate of each person’s color preference at 
every point in the experiment in order to get an 
accurate picture of its effect upon performance time. 

It wouldn’t be feasible for me to ask the subjects to
rate their color preference every time they see one of
the containers. This would be far too disconcerting.
What I have been doing, with some success, is this.
I ask the subjects to rate the colors again at the
end of the experiment. I then compare their second
rating with their first rating, and, by applying a
mathematical formula, I can plot an estimate of
each person’s color preference at every point in the
experiment. In line with this, I’d like you to look at
these colors again and tell me which one you think
is prettier and by how much, according to this scale.
Try not to think of your previous rating. That is,
don’t consciously try to give the same rating you
gave the last time just to show me what a consistent
person you are. On the other hand, don’t consciously
attempt to report a change if there was no change.

After the subjects had given their post-experimental 
rating, they were asked to account for any shift in their 
initial rating. Virtually all subjects referred to the hue 
and brightness of the object itself. For example, “The
green simply looks much brighter than it did before”;
or “The red looks much richer.” 


Subjects and Design 

The subjects used in the experiment were 60 under-
graduate women.² In order to be certain that money
would be an important reward, the experimenter 
recruited the subjects from the university employment 
service and informed them that the money which they 
found in the containers would constitute their salary. 
Thirty subjects were tested in each condition. The 
rewarded and nonrewarded colors were counter- 
balanced. 

The instructions given to the subjects in the Easy 
and Effortful conditions were identical except for the 
description of the tasks. In the Effortful condition, 
subjects were told that theirs was an extremely tedious 
task; in the Easy condition, subjects were told that 
theirs was not a particularly tedious task. The task was 
then described and demonstrated. 

There were a total of 51 containers in the population 
from which each subject fished. Seventeen of these 
contained money while 34 were empty. All subjects 
were told that they would be working for a limited, 
but unspecified period of time. Hence the more quickly 
they worked, the more money they would have a chance 
to make. Actually, they were allowed to work until 
they had fished out 16 nonrewarded containers. Since 
the population consisted of twice as many nonrewarded 
as rewarded containers, each subject received approxi- 
mately eight rewarded containers. There was an 
average of $.25 in the rewarded containers; thus each 
subject earned approximately $2.00. 
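
The arithmetic behind "approximately eight rewarded containers" can be checked with a short sketch. It assumes, as the report implies but does not state formally, that each subject fished containers in a uniformly random order without replacement and stopped at the 16th nonrewarded container. Under that assumption the count of rewarded containers follows a negative hypergeometric distribution. The function names below are illustrative and not part of the original report.

```python
import random
from fractions import Fraction


def expected_rewarded(total, rewarded, stop_after_nonrewarded):
    """Mean number of rewarded containers drawn before the r-th nonrewarded one.

    Negative hypergeometric mean: r * K / (N - K + 1), where K is the number
    of rewarded containers and N - K the number of nonrewarded ones.
    """
    nonrewarded = total - rewarded
    return Fraction(stop_after_nonrewarded * rewarded, nonrewarded + 1)


def simulate_rewarded(total, rewarded, stop_after_nonrewarded, trials=20000, seed=0):
    """Monte Carlo check of the same quantity (random fishing order assumed)."""
    rng = random.Random(seed)
    pool = [True] * rewarded + [False] * (total - rewarded)
    hits_total = 0
    for _ in range(trials):
        rng.shuffle(pool)
        hits = misses = 0
        for has_money in pool:
            if has_money:
                hits += 1
            else:
                misses += 1
                if misses == stop_after_nonrewarded:
                    break
        hits_total += hits
    return hits_total / trials


if __name__ == "__main__":
    mean_rewarded = expected_rewarded(51, 17, 16)      # 272/35, about 7.77
    mean_earnings = float(mean_rewarded) * 0.25        # at an average of $.25 each
    print(mean_rewarded, round(mean_earnings, 2))
    print(simulate_rewarded(51, 17, 16))
```

The exact mean, 272/35, is a little under eight containers, which at an average of $.25 per container comes to slightly less than the $2.00 quoted in the text.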

At the close of the session, the experimenter ex- 
plained the true nature of the experiment to each 
subject and discussed the need for the deception. 


² The number of subjects actually tested was 68.
Three of the subjects were dropped from the experiment
because they indicated that they had guessed the
purpose of the experiment. That is, they suspected that
the presence of money in the containers was supposed
to influence the perceived attractiveness of the colors.
Two of these were in the Effortful condition, one was in
the Easy condition. In addition, prior to the experiment,
the experimenter decided to discount the results of all
subjects whose initial color preference was four or
higher. Such an extreme initial preference meant, for
all practical purposes, that any subsequent shift in color
preference would be limited to the direction of the
initially unpreferred color. The results of five subjects
were discarded for this reason; two were in the Effortful
condition, three were in the Easy condition.


RESULTS AND DISCUSSION

TABLE 1
MEANS AND STANDARD DEVIATIONS OF THE SHIFT
IN RATED ATTRACTIVENESS OF COLORS

                                     Easy                   Effortful
                              (a)     (b)   Total      (a)     (b)   Total

Initial preference      M    +.21    +.88    +.52     -.71    +.17    -.30
for rewarded color      σ    1.49     .79    1.24     1.06     .85    1.07
                        N       7       6      13        7       6      13

Initial preference      M   +1.19   +1.42   +1.31     +.12    +.40    +.27
for nonrewarded color   σ    1.22     .76    1.03      .82     .54     .70
                        N       8       9      17        8       9      17

Combined initial        M    +.73   +1.20    +.97     -.27    +.31    +.02
preference              σ    1.42     .81    1.18     1.03     .69     .92
                        N      15      15      30       15      15      30

[The rotated column headings are illegible in the scan; within each
condition, columns (a) and (b) are the two counterbalanced
rewarded-color subgroups and the third column is their total.]

Note. A plus sign indicates a shift in the direction of the
rewarded color; a minus sign indicates a shift in the direction of
the nonrewarded color.

For each subject, a score was calculated
to indicate the change in her ratings of the
relative attractiveness of the colors. The
ratings can be considered as points on a con-
tinuum from “Red 5” to “Green 5,” with a
midpoint of 0. If the change was in the direc-
tion of the rewarded color, it was given a plus
sign; if the change was in the direction of the
nonrewarded color, it was given a minus sign.
For example, if green was rewarded, and a
subject’s pre-experimental rating was “Red 2”
and her post-experimental rating was “Red
1,” this is a change of one unit in the direction
of the rewarded color and, therefore, her score
would be +1.
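
The scoring rule just described can be written as a short sketch. The representation of a rating as a `(color, magnitude)` pair and the function names are assumptions made for illustration; the report specifies only the continuum from "Red 5" to "Green 5" and the sign convention.

```python
def rating_to_axis(rating):
    """Place a rating on one signed axis: Red 5 = -5, midpoint = 0, Green 5 = +5.

    `rating` is a (color, magnitude) pair such as ("red", 2); this encoding
    is a hypothetical convenience, not the original scoring sheet.
    """
    color, magnitude = rating
    if color not in ("red", "green"):
        raise ValueError("color must be 'red' or 'green'")
    if not 0 <= magnitude <= 5:
        raise ValueError("magnitude must lie between 0 and 5")
    return magnitude if color == "green" else -magnitude


def shift_score(rewarded_color, pre, post):
    """Change from pre to post rating, signed toward the rewarded color.

    Positive: the preference moved toward the rewarded color.
    Negative: the preference moved toward the nonrewarded color.
    """
    if rewarded_color not in ("red", "green"):
        raise ValueError("rewarded_color must be 'red' or 'green'")
    change_toward_green = rating_to_axis(post) - rating_to_axis(pre)
    return change_toward_green if rewarded_color == "green" else -change_toward_green


# The worked example from the text: green rewarded, "Red 2" before, "Red 1" after.
example = shift_score("green", ("red", 2), ("red", 1))  # +1
```

Had red been the rewarded color in the same example, the identical pair of ratings would score -1, since the subject moved one unit away from red.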

The means and standard deviations of these 
scores for the subjects in the Easy and Effort- 
ful conditions are presented in Table 1, sepa- 
rately for those subjects who initially preferred 
the rewarded color and for those who initially 
preferred the unrewarded color. An examina- 
tion of the table shows that in the Easy con- 
dition, subjects shifted their color preferences 
in the direction of the rewarded color. In the 
Effortful condition, however, no such shift 
occurred. This general trend can be seen most
clearly when one examines the combined 
initial preferences. The significance of this 
difference was tested by an analysis of variance. 
This technique was used for the following 
reason. Table 1 shows that an individual’s 
initial color preference was a source of vari- 
ability; that is, within a given condition, 
those subjects who originally preferred the 
nonrewarded color tended to shift their pre-
ference more toward the rewarded color than
did those who initially preferred the rewarded
color. An analysis of variance was used in
order to eliminate the effects of this known 
source of variance. 
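
As a consistency check on Table 1, the combined means for each condition can be recovered as N-weighted averages of the two initial-preference subgroups. The sketch below uses only the "Total" column values reported in the table; the helper name `pooled_mean` is illustrative.

```python
def pooled_mean(groups):
    """N-weighted mean of subgroup means; `groups` is a list of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# (mean shift, N) for subjects initially preferring the rewarded color,
# then for those initially preferring the nonrewarded color (Table 1, Total columns).
easy = pooled_mean([(0.52, 13), (1.31, 17)])
effortful = pooled_mean([(-0.30, 13), (0.27, 17)])

print(round(easy, 2), round(effortful, 2))  # 0.97 0.02
```

In both conditions the subgroup that initially preferred the nonrewarded color shows the larger shift toward the rewarded color (1.31 against .52, and .27 against -.30), which is the source of variance the analysis was meant to remove.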

Separate analyses of variance were computed 
for the red rewarded and green rewarded 
situations. In both situations, the interaction 
variance was considerably smaller than the 
within cells variance. The F ratios, therefore,
were computed by dividing the variance 
between conditions by the variance within 
cells. With red rewarded, F = 9.64. With 1 
and 26 degrees of freedom, this ratio exceeds 
the .01 level of significance. With green re- 
warded, F = 4.84. With 1 and 26 degrees of 
freedom, this ratio exceeds the .05 level of 
significance. 
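
The stated significance levels can be checked from the reported F ratios alone. With 1 degree of freedom in the numerator, F equals the square of a t statistic, so the upper-tail probability of F(1, 26) is the two-tailed probability of t with 26 degrees of freedom. The sketch below integrates the t density numerically with Simpson's rule, using only the standard library; the report itself gives no exact p values, and the function names are illustrative.

```python
import math


def t_density(x, df):
    """Probability density of Student's t with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def p_value_f_1_df(f_ratio, df_within, steps=2000):
    """Upper-tail p for an F ratio with (1, df_within) degrees of freedom.

    Uses F = t**2 and composite Simpson integration of the t density
    from 0 to sqrt(F); `steps` must be even.
    """
    t = math.sqrt(f_ratio)
    h = t / steps
    total = t_density(0.0, df_within) + t_density(t, df_within)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df_within)
    area_zero_to_t = total * h / 3
    return 2 * (0.5 - area_zero_to_t)


p_red = p_value_f_1_df(9.64, 26)    # reported as beyond the .01 level
p_green = p_value_f_1_df(4.84, 26)  # reported as beyond the .05 level
```

Both values fall where the text places them: the red-rewarded ratio lies below .01, and the green-rewarded ratio lies between .01 and .05.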

This finding suggests that in the Easy con- 
dition, secondary reinforcement was operating 
to increase the attractiveness of the rewarded 
color. Why was there no such increase in the 
Effortful condition? Does secondary reinforce-
ment not operate when effort is involved?
There appears to be no reason to assume this. 
Our interpretation of the data is that in both 
the Easy and the Effortful conditions, there 
was an effect due to secondary reinforcement. 
This resulted in an increase in the perceived 
attractiveness of the rewarded color. In the 
Effortful condition, however, dissonance was 
also involved. In this condition, every time 
the subjects obtained a nonrewarded container, 
their cognition that the container held no 
money was dissonant with their cognition 
that they had worked hard to obtain it. By 
finding something about the empty container 
to which they could attach value, they could 
reduce dissonance by justifying the expendi- 
ture of effort for obtaining an otherwise worth- 
less object. The rewarded and nonrewarded 
containers were identical in all respects except 
for their color. Thus a convenient way for the 
subjects to reduce dissonance was by convinc- 
ing themselves that the nonrewarded color 
was more attractive than they had previously 
thought. In the Effortful condition, then, it 
appears that two opposing forces were in 
operation; one enhancing the value of the 
rewarded color, the other enhancing the value 
of the nonrewarded color. In the Easy condi- 
tion subjects expended very little effort in 
performing the task; consequently, there was 
little or no dissonance involved when they 
obtained an unrewarded container. Thus
subjects in the Easy condition found the
rewarded color to be more attractive. 


Some Theoretical and Methodological Problems 


Although the differences between experi- 
mental conditions occurred as predicted, it is 
somewhat inelegant to explain the fact that 
there was no change in the Effortful condition 
by positing two equal and opposing forces. 
The results would have been more striking if, 
in the Effortful condition, the scores had been 
markedly in the direction of the nonrewarded 
color. Perhaps this effect could have been 
achieved by reducing the size of the reward; 
this would have reduced the attractiveness of 
the secondary reinforcer, and hence, might 
have resulted in a greater relative gain in the 
attractiveness of the unrewarded color. But 
decreasing the size of the reward might have 
decreased the amount of dissonance also. For 
example, a person probably would experience 
less dissonance if he failed to obtain $.01 than 
if he failed to obtain $10.00. 

A possible alternative method of bringing 
about a greater increase in the attractiveness 
of the nonrewarded color might be to increase 
the amount of effort involved in the Effortful 
condition. That is, a greater expenditure of 
effort in vain would probably result in more 
dissonance and hence a greater gain in the 
rated attractiveness of the nonrewarded color. 
But, by doing this, one might also introduce 
dissonance into the rewarded trials; the subject 
might feel that obtaining $.25 is not worth 
all the effort. If this occurred, it would tend to 
increase the rated attractiveness of the re- 
warded color as well as that of the nonre- 
warded color. It must be emphasized that these 
speculations, as yet, have not been tested. 
Nevertheless, it seems reasonable to assume 
that in this design the effects of dissonance 
and secondary reinforcement are interdepen- 
dent; attempting to manipulate one would 
probably affect the other. 


Competence—An Alternative Explanation 


The results of this experiment can alter-
natively be interpreted in terms of White’s
(1959) concept of competence. Since the task
in the Effortful condition was a difficult one,
it may conceivably have served as a challenge
to the subjects in that condition. Thus each
time they succeeded in hooking a container,
they may have experienced a feeling of efficacy
or mastery. This positive feeling may have 
then generalized to the containers—red and 
green alike. The feeling of efficacy may have 
been so great that, by comparison, the positive 
effect due to the presence of money may have 
been negligible. That is, although the presence 
of money served to make the rewarded color 
more attractive in the Easy condition, its 
presence in the Effortful condition may have 
been too small to be apparent in the face of 
the effect of a very great feeling of efficacy. 
Such an interpretation would account for the 
fact that the relative attractiveness of the two 
colors remained virtually constant in the 
Effortful condition, while in the Easy condi- 
tion, there was a significant shift in the direc- 
tion of the rewarded color. 

There is evidence, however, that casts some 
doubt upon this interpretation. If the effect of 
competence in the Effortful condition had 
been great enough to dwarf the effect of secon- 
dary reinforcement, one might expect subjects 
in the Effortful condition to have derived 
more enjoyment from their task than subjects 
in the Easy condition. Some data were gathered 
that are relevant to this problem. After per- 
forming the task, each subject was asked to 
rate his feelings about the task on an 11-point 
scale, from 0 (it was an extremely dull, un- 
enjoyable task) to 10 (it was an extremely 
interesting, enjoyable task). The mean rating
of subjects in the Effortful condition was 3.91, 
while that of the subjects in the Easy condition 
was 3.27. Thus both tasks were considered 
about equally dull. Although negative evi- 
dence is not conclusive, it nevertheless weakens 
an interpretation of the results in terms of 
competence. 





Secondary Reinforcement or Scarcity 

The experiment was designed in such a way 
that each subject received about twice as many 
nonrewarded as rewarded containers. The 
reason for this imbalance was to increase the 
number of “dissonant” experiences for each 
subject. But the fact that the rewarded color 
was also the scarcer color provides a source of 
ambiguity in the interpretation of the results. 
In the Easy condition, the rewarded color may 
conceivably have undergone a gain in attrac- 
tiveness—not because it contained a reward, 
but because people, perhaps, have a tendency 
to value things that are scarce in the environ-
ment. To eliminate this ambiguity, 60 addi-
tional subjects were run under exactly the
same conditions, except that there was money 
in every one of the containers. If the results 
in the Easy condition had been a function 
of scarcity rather than secondary reinforce- 
ment, one would expect the same gain in the 
attractiveness of the scarcer color under con- 
ditions of equal reward. No such gain occurred. 
In both the Easy and the Effortful conditions, 
there was a very slight change in the direction 
of the scarcer color. The mean of the change 
in the relative attractiveness of the two colors 
in the Effortful condition was .02, while in 
the Easy condition it was .13. Therefore, the 
results previously reported cannot be attrib- 
uted to the effects of scarcity. The results 
indicate that, at least with human subjects, 
secondary reinforcement does involve a gain 
in the attractiveness of a previously neutral 
stimulus. 


SUMMARY AND CONCLUSIONS 


A laboratory experiment was conducted 
which demonstrated that the degree of effort 
a person expends in attempting to achieve a 
reward has an effect on the relative attractive- 
ness of stimuli associated with rewarded trials 
and stimuli associated with nonrewarded
trials. The hypothesis was that, as effort in-
creases, there is a corresponding increase in
the relative attractiveness of stimuli associated
with nonrewarded trials.

To test the hypothesis, the experimenter
had subjects perform a task that involved
fishing for containers in order to obtain money
that was inside some of them. For some sub-
jects, the task was made easy (Easy condi-
tion); for others, the task was made effortful
(Effortful condition). The rewarded containers
were of a different color from that of the
nonrewarded containers. Subjects were asked
to rate the relative attractiveness of the two 
colors before and after performing the task. 

The results were in accord with the predic-
tion. In the Easy condition, there was an
increase in the relative attractiveness of the 
rewarded color. In the Effortful condition, 
however, there was no change in the relative 
attractiveness of the two colors. The difference 
between the subjects’ ratings in the two con- 
ditions was highly significant. The results were 
interpreted in terms of an interaction between 
the effects of cognitive dissonance and those of 
secondary reinforcement.


THE STIMULATING VERSUS CATHARTIC EFFECTS OF A
VICARIOUS AGGRESSIVE ACTIVITY 


SEYMOUR FESHBACH


University of Pennsylvania 


THE present study is concerned with the
complex effects of participation in a 
presumably vicarious aggressive activity 
upon subsequent aggressive behavior. A num- 
ber of studies have demonstrated that the 
expression of aggression—whether directly or 
in symbolic form—results in a lowering of 
subsequent aggression (Berkowitz, 1960; 
Feshbach, 1955; Pepitone & Reichling, 1955; 
Rosenbaum & de Charms, 1960; Thibaut & 
Coules, 1952). However, there is also experi- 
mental evidence to the effect that aggressive 
activity has a stimulating effect upon the 
manifestation of other aggressive acts (Fesh- 
bach, 1956; Kenny, 1953); that is, aggression 
may breed aggression. 

Since both possibilities, reduction and stim-
ulation, have been experimentally observed,
the pertinent issue then is under what con- 
ditions a vicarious aggressive act increases 
and under what conditions it decreases the 
probability of subsequent aggressive behavior. 
One such condition suggested by differences 
in procedure between the studies that ob- 
tained evidence of a cathartic effect and those 
demonstrating a stimulating effect is the emo- 
tional state of the subject at the time the 
aggressive act is performed; that is, if the 
subject is angry at the time he engages in the 
aggressive activity, he can then use the act 
to satisfy and thereby reduce his hostility. 

The general hypothesis is suggested that in 
order for an activity to have drive reducing 
properties, components of the drive must be 
present or evoked during performance of the 
activity; that is, there must be some func- 
tional connection between the vicarious act 
and the original drive instigating conditions. 
While it is undoubtedly true that the vicissi- 
tudes of life will arouse hostilities that cannot 
be directly discharged, it does not follow 
that any indirect aggressive act will have the 
property of reducing hostility that has been 
evoked under markedly different circum- 
stances. According to the present view, a 
child’s anger toward its mother will not be
reduced by an aggressive act toward a doll
figure unless its anger toward the mother is 
aroused when the aggressive act is performed. 
The evocation of anger may not be a sufficient 
condition—the doll figure may have to be 
similar to the mother—but it is probably a 
necessary condition for drive reduction to take 
place. 

If the subject is not hostile at the time of 
participating in an aggressive act, his sub- 
sequent aggressive behavior will not merely 
remain unaffected but is very likely to be 
increased. An increase in aggression following 
a vicarious aggressive act could result from a 
number of different processes: a reduction 
in inhibition or aggression anxiety, reinforce- 
ment of aggressive responses, and finally 
conditioned stimulation of aggressive drive 
and/or aggressive responses.

On the basis of the foregoing considerations, 
the following hypotheses are proposed: Partici- 
pation in a vicarious aggressive act results in 
a reduction in subsequent aggressive behavior 
if aggressive drive has been aroused at the 
time of such participation; if aggressive drive 
has not been aroused at the time of participa- 
tion in a vicarious aggressive act, such partici- 
pation results in an increase in subsequent 
aggressive behavior.


METHOD 


The experimental procedure consisted of arousing a 
subject’s aggressive drive before participation in a 
vicarious aggressive act or before participation in a 
neutral act and then obtaining measures of aggression 
subsequent to these interpolated activities. The varia- 
tion in level of aggression was accomplished by means 
of an insult versus noninsult condition and the variation 
in the interpolated activity consisted of exposure to a 
fight film versus a neutral film. 


Subjects 


The subjects were male college student volunteers 
who were assigned at random to one of the four treat- 
ment groups generated by the two experimental vari- 
ables. One hundred and one subjects were used in the 
study, with approximately equal numbers in each 
experimental condition. The subjects were seen in 
small groups by the experimenter so that nine experimental sessions in all were held, three for the Noninsult Fight Film condition and two sessions for each of the other experimental conditions. 


Procedure 


Insult versus Noninsult condition. Subjects assigned to the insult groups were subjected to a number of unwarranted and extremely critical remarks. These comments essentially disparaged the intellectual motivation and the emotional maturity of the students.¹ Previous studies (Feshbach, 1955; Gellerman, 1956) have provided abundant evidence that this technique successfully arouses hostility toward an experimenter. Subjects assigned to the Noninsult condition were given standard test instructions. 

Aggressive Film versus Neutral Film condition. Subjects in the Insult and Noninsult groups then witnessed either a Fight Film or a Neutral Film. The Fight Film consisted of a film clip of a rather exciting prize fight sequence taken from the motion picture Body and Soul while the neutral film depicted the consequence of the spread of rumors in a factory. The duration of each of the films was approximately 10 minutes. As a rationale for the presentation of the film, the subjects were told before the film was presented that they would be asked to judge the personality of the main character in the film. Following the completion of the film, each subject indicated his impression of the personality of the hero of the film on a questionnaire provided for that purpose. 

Dependent Measures of Aggression. All subjects were given a modified word association test which, in a previous study (Gellerman, 1956), had been shown to be sensitive to differences in the arousal of aggression. The test involves the presentation of five aggressive words interspersed among six neutral stimuli as follows: walk, murder, relax, wash, choke, travel, massacre, stab, sleep, torture, listen. The subjects are asked to give in written form a series of associations to each word. The stimuli are presented orally and also visually, the experimenter holding up a 5” × 8” card on which the stimulus word is printed. The subject’s Aggression score is based on the number of aggressive word associations among the first 10 responses to each of the aggressive stimulus words. The maximum score that can therefore be obtained on this measure is 50. 
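
The scoring rule just described can be sketched in a few lines. This is only an illustration: the article does not say how an association was judged to be aggressive, so the `aggressive_lexicon` set below is a hypothetical stand-in for that judgment, and the function name is likewise an assumption.

```python
from typing import Dict, List, Set

# The five aggressive stimulus words named in the text.
AGGRESSIVE_STIMULI = ["murder", "choke", "massacre", "stab", "torture"]


def aggression_score(
    responses: Dict[str, List[str]],
    aggressive_lexicon: Set[str],  # hypothetical stand-in for the raters' judgment
    max_responses: int = 10,
) -> int:
    """Count aggressive associations among the first 10 responses
    to each aggressive stimulus word (maximum possible score: 50)."""
    total = 0
    for stimulus in AGGRESSIVE_STIMULI:
        first_responses = responses.get(stimulus, [])[:max_responses]
        total += sum(1 for word in first_responses if word.lower() in aggressive_lexicon)
    return total


# Illustrative (invented) protocol for one subject.
example = {
    "murder": ["kill", "death", "book"],
    "stab": ["knife"] * 12,  # only the first 10 responses are counted
}
example_score = aggression_score(example, {"kill", "death", "knife"})
```

Responses to the six neutral stimuli are ignored, which is why the ceiling is 5 × 10 = 50.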

Subsequent to the administration of the word association test, the first experimenter left the room having presumably completed the study. A second experimenter then entered and informed the subjects that the psychology department wished to assess students’ opinions of the value of participating in psychological experiments. A questionnaire was then administered dealing with the subjects’ attitudes toward the experimenter and with their evaluation of the experiment. The questionnaire, which consists of six items, each of which has six alternatives, is described in more detail in a previous study (Feshbach, 1955). It is scored so that the least aggressive choice for a particular item is given a score of 1 and the most aggressive choice a score of 6. 

¹ The author wishes to express his gratitude to Abraham Wolf for his competence and courage in carrying out this phase of the experiment. 


RESULTS 


By hypothesis, it was predicted that the 
Insult group exposed to the Fight Film would 
manifest less subsequent aggression on each 
of the two measures of aggression than the 
Insult group exposed to the Neutral Film 
while the Noninsult group exposed to the 
Fight Film would display more subsequent 
aggression than the Noninsult group exposed 
to the Neutral Film. The word association 
data bearing upon these predictions are pre- 
sented in Table 1. The mean differences are in 
accordance with expectation, the Insult-Fight 
(IF) Film group responding with fewer ag- 
gressive associations than the Insult-Neutral 
(IN) Film group and the Noninsult-Fight 
(NIF) Film group responding with more 
aggressive associations than the Noninsult- 
Neutral (NIN) Film group. The results of an 
analysis of variance of the data indicate that 
the interaction between the Insult and the 
Film variable is statistically significant. The 
difference between the IF Film and the IN Film groups falls short of the 5% confidence level, the value of t being 1.9. The difference between the NIF Film and NIN Film groups yields a t value of approximately 1 which is clearly not significant. 


TABLE 1 

A. MEAN AGGRESSIVE WORD ASSOCIATION RESPONSES OBTAINED UNDER EACH EXPERIMENTAL CONDITION 

                      Film (F) 
Drive (D)       Fight (F)      Neutral (N) 
Insult          24.5 (25)ᵃ     28.9 (21) 
Noninsult       27.7 (29)      25.3 (25) 

B. SUMMARY OF ANALYSIS OF VARIANCE OF AGGRESSIVE WORD ASSOCIATION RESPONSES 

Source      SS          df     MS         F 
D           8.93        1      8.93 
F           38.43       1      38.43 
D × F       291.59      1      291.59     4.58* 
Within      6,111.80    96     63.66 
Total       6,450.75    99 

ᵃ The word associations of one subject were not scored due to illegibility. 
* p < .05. 


TABLE 2 
DISTRIBUTION OF AGGRESSIVE WORD ASSOCIATION RESPONSES FALLING ABOVE AND BELOW THE MEDIAN AS A FUNCTION OF INSULT FIGHT FILM AND INSULT NEUTRAL FILM TREATMENTS 

Treatment               <27     >27 
Insult Fight Film       17      8 
Insult Neutral Film     10      19 

Note.—χ² = 6.02; p < .05 


TABLE 3 
A COMPARISON OF MEAN SCORES ON THE AGGRESSION QUESTIONNAIRE 

      Insult-Fight    Insult-Neutral    Noninsult-Fight    Noninsult-Neutral 
      (IF)            (IN)              (NIF)              (NIN) 
      (N = 26)        (N = 29)          (N = 20*)          (N = 25) 
M     14.6            19.5              13.7               15.0 
σ     3.72            3.90              2.52               2.95 

Note.—t(IF − IN) = 4.7; p < .01. 
* One subject failed to complete questionnaire. 
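
As a quick arithmetic check on the analysis of variance summarized in Table 1, the F ratio for the Drive × Film interaction is the interaction mean square divided by the within-groups mean square. The sketch below uses only the sums of squares and degrees of freedom printed in the table; the variable and function names are illustrative.

```python
def mean_square(sum_of_squares: float, df: int) -> float:
    """Mean square = sum of squares / degrees of freedom."""
    return sum_of_squares / df


def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F = effect mean square / error (within-groups) mean square."""
    return ms_effect / ms_error


# Values as printed in Table 1B.
ss = {"D": 8.93, "F": 38.43, "DxF": 291.59, "Within": 6111.80}
df = {"D": 1, "F": 1, "DxF": 1, "Within": 96}

ms_within = mean_square(ss["Within"], df["Within"])
f_interaction = f_ratio(mean_square(ss["DxF"], df["DxF"]), ms_within)
ss_total = sum(ss.values())

print(round(ms_within, 2), round(f_interaction, 2), round(ss_total, 2))
```

The component sums of squares add up to the printed total, and the interaction F agrees with the tabled value of 4.58.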

The contrast between the IF Film and IN 
Film groups is more sharply delineated by a 
simple median split. The chi square for the 
fourfold table presented in Table 2 is 6.02 
which yields a p value of <.02. The word 
association data, then, indicate that under 
conditions of anger-arousal, witnessing a 
fight film results in a lowering of aggression. 
However, the hypothesized stimulating effect 
of an aggressive film under nonaroused condi- 
tions is not borne out by the data. 
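
The median-split chi square can be reproduced, to rounding, from the cell counts in Table 2. The sketch below is a plain Pearson chi square for a fourfold table without continuity correction; the article does not state which formula was used, and the count of 8 for the Insult Fight Film group above the median is an assumption (it is the value that brings that row to the 25 scored subjects). The p value uses the identity that, for one degree of freedom, the upper tail of the chi-square distribution equals `erfc(sqrt(x / 2))`.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi square (no continuity correction) for the table
        a  b
        c  d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def p_value_df1(chi_square: float) -> float:
    """Upper-tail probability of chi square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Rows: Insult Fight Film, Insult Neutral Film; columns: below, above median.
chi2 = chi_square_2x2(17, 8, 10, 19)
p = p_value_df1(chi2)
print(round(chi2, 2), round(p, 3))
```

Under these assumptions the statistic comes out near 6.03, close to the reported 6.02, with a probability between .01 and .02.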

The questionnaire data are presented in 
Table 3. Because of the lack of homogeneity 
of variance between the IN and NIF Film 
groups, separate comparisons were made 
between pertinent groups and, in these com- 
parisons, the variances of the respective dis- 
tributions are not reliably different. As was 
the case with the word association data, the 
IF Film group displays significantly less ag- 
gression on the questionnaire than does the 
IN group. The difference between the Non- 
insult groups is not in the predicted direction 
but is small and unreliable. 
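
The comparison of the two Insult groups on the questionnaire can be approximated from the summary statistics in Table 3. The sketch assumes a pooled-variance t test and treats the second row of the table as standard deviations computed with n − 1; neither assumption is stated in the article, so the result is a rough check rather than a reconstruction of the author's computation.

```python
import math


def pooled_t(mean_a: float, sd_a: float, n_a: int,
             mean_b: float, sd_b: float, n_b: int) -> float:
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error


# Insult-Neutral (IN) versus Insult-Fight (IF), values from Table 3.
t_in_vs_if = pooled_t(19.5, 3.90, 29, 14.6, 3.72, 26)
print(round(t_in_vs_if, 2))
```

The value obtained this way is close to the 4.7 given in the table note.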

The difference in subsequent aggressive attitudes between the insulted group exposed to the fight film and the insulted group exposed to the neutral film is further illustrated by a simple median split. The chi square for the fourfold table presented in Table 4 is 15.1, which is significant at less than the .001 level. 


TABLE 4 
DISTRIBUTION OF AGGRESSION QUESTIONNAIRE SCORES FALLING ABOVE AND BELOW THE MEDIAN AS A FUNCTION OF INSULT FIGHT FILM AND INSULT NEUTRAL FILM TREATMENTS 

Treatment               <17.5 
Insult Fight Film 
Insult Neutral Film     7       2 

Note.—χ² = 15.1; p < .001. 


DISCUSSION 


The experimental results are consistent 
with the hypothesis that the drive reducing 
effect of a vicarious aggressive act is dependent 
upon the aggressive state of the subject at the 
time of the vicarious aggressive activity. 
Witnessing the prize fight film resulted in a 
significant relative decrement in aggression 
in comparison to witnessing the neutral film 
only for those subjects in whom aggression 
had been previously aroused by the insulting 
comments of the experimenter. The predicted 
increase in aggression for the noninsulted 
subjects following exposure to the fight film 
did not occur, however. Each of these two 
outcomes warrants further comment. 

With regard to the difference between the 
two Insult groups in subsequent aggression, a 
possible alternative to a catharsis or drive 
reduction hypothesis is one that assumes that 
guilt or revulsion stimulated by the fight 
film is the primary mechanism responsible for 
the lowered aggression. Berkowitz (1958, 
1960) has strongly argued for such an explana- 
tion of a reduction in aggressive behavior 
following an aggressive act. However, it must 
be noted that the evidence for a guilt or in- 
hibition process is most indirect and inferential. 

With regard to the present study, the guilt 
alternative is certainly possible, although, 
for various reasons to be suggested below, not 
a likely one. If guilt arousal were a ubiquitous 
process, occurring whenever people are given 
the opportunity to indulge in aggressive 
fantasies, then the fight film should similarly 
have inhibited the aggressive response output 
of the Noninsult group. The possibility still 
remains that guilt arousal can account for the 
aggression reducing effects of fantasy under 
conditions where aggression has recently been 
stimulated, as in the Insult condition. As a 
check on whether the lowered aggression of the 
IF Film group was due to some inhibitory 
factor, the word associations were scored for 
defensiveness. A previous study of the effects 
of inhibition upon aggressive word associations 
has shown that when inhibition is experimen- 
tally aroused, the number of aggressive re- 
sponses decreases while the number of defen- 
sive responses increases (Gellerman, 1956). 
While, in the present study, a difference was 
observed in the number of aggressive associa- 
tions, the difference between the two Insult 
groups in the number of defensive associations 
was negligible and insignificant. The absence 
of an increment in defensive responses, while 
not decisive since the experiment cited em- 
ployed an inhibition procedure more closely 
resembling fear rather than guilt, is more 
consistent with a drive reduction rather than 
guilt explanation of the decrease in aggression 
following the exposure of the insulted subjects 
to the Fight Film. 

The problem remains of accounting for the failure to obtain the expected increase in aggression in the Noninsult group. One possible reason is the limitation of the questionnaire instrument as a measure of aggression. Although one’s preference for or attitude toward another person is frequently used as an index of aggression, as was the case in the present experiment, dislike and aggression are not equivalent dimensions. At the extreme, aversion and aggression are likely to be strongly correlated but within moderate ranges of feeling, the association between dislike and aggression may well be negligible. For this reason, the word association measure is probably a better instrument than the attitude questionnaire for detecting changes in aggression in the noninsulted groups. However, although the relative increment in aggressive associations following exposure of the Noninsult group to the Fight Film was in the predicted direction, it was not statistically significant. Whether this failure to obtain evidence of a stimulating effect of a vicarious aggressive activity under relaxed emotional conditions is due to inadequacies in the theoretical analysis or to limitations in the methods utilized cannot be ascertained from the present data. 

On the other hand, the data consistently 
reflect the dependence of the drive reduction 
effect upon the arousal of aggression at the 
time the subject is engaging in the vicarious 
aggressive activity. Presumably vicarious 
aggressive acts do not willy-nilly serve as 
outlets for aggressive motivation. This latter 
process warrants further attention. Aggression 
is not an ever-present tension system pervading 
all of an individual’s activities. Like other 
acquired motives, its appearance is very much 
dependent upon situational factors; and, the 
more specific the category of objects toward 
which the aggression is directed, the narrower 
is both the range of stimuli that can elicit 
the motivation and the range of situations 
that can serve as substitute outlets for the 
aggression. 

What would appear to be a relatively simple matter, the effects of a vicarious aggressive activity upon subsequent aggressive behavior, is in actuality a quite complex process. The present study has examined the influence of the drive state of the organism upon this process. Beyond the requirement of replication in a variety of situations, further research is needed to establish the extent to which other variables determine the effects of so-called vicarious aggressive activities and to establish the precise mechanism by which the performance, direct or vicarious, of an aggressive act influences subsequent aggressive behavior. 

SUMMARY 

Studies of the effects of a presumably vicarious aggressive activity upon subsequent aggressive activity suggest that under certain conditions the activity will tend to increase, and under other conditions decrease, the probability of subsequent aggressive behavior. The purpose of this experiment was to study the effects of one such condition, namely, the emotional state of the subject at the time the vicarious aggressive activity is performed. Specifically, it was proposed that a vicarious aggressive activity results in a reduction in subsequent aggressive behavior if the subject is emotionally aroused at the time he is engaging in this activity, but if anger has not 
been aroused, the activity results in an increase 
in subsequent aggressive behavior. The two 
independent variables manipulated in the 
study consisted of an Insult versus Noninsult 
condition and an Aggressive Film versus 
Neutral Film condition. One hundred and one 
college students were assigned at random to the 
four treatment groups generated by the two 
experimental variables. The subjects met the 
experimenter in small groups so that nine 
experimental sessions in all were held. Sub- 
jects assigned to the Noninsult condition were 
given standard test instructions while sub- 
jects in the Insult groups were subjected to a 
number of unwarranted and extremely critical 
remarks. The subjects then witnessed either 
an Aggressive Film or a Neutral Film. The 
former consisted of a film clip depicting a prize 
fight sequence while the latter depicted the 
consequences of the spread of rumors in a 
factory. They were then administered a word 
association test and under the guise of a 
departmental assessment of the value of 
students’ serving as experimental subjects, a 
second experimenter administered a question- 
naire dealing with the subjects’ attitudes 
toward the first experimenter and with their 
evaluation of the experiment. The degree of 
aggression manifested on the attitude question- 
naire and the number of aggressive responses 
on the word association test constituted the 
dependent measures. 

A significant interaction in the predicted direction was obtained for the Word Association measure: the Insult-Aggressive Film group responding with fewer aggressive associations than the IN Film group, and the Noninsult-Aggressive Film group responding with more aggressive associations than the NIN Film group. A similar significant difference 
between the two Insult groups was found on 
the attitude questionnaire, but the difference 
between the two Noninsult groups on this 
measure was unreliable and was not in the 
predicted direction. 

The results were interpreted as being con- 
sistent with a drive reduction theory, although 
an inhibitory process (guilt arousal) cannot be 
excluded by the evidence at hand. The de- 
pendence of the aggression reducing effect of 
exposure to a film depicting violent activity 
upon the prior or simultaneous arousal of 
aggressive drive was stressed. 


THE INFLUENCE OF LANGUAGE ON THE DEVELOPMENT OF 
CONCEPT FORMATION IN DEAF CHILDREN¹ 


HANS G. FURTH² 


Catholic University of America 


ATTEMPTS to appraise the contribution of language³ in the development of thinking are made difficult by the fact that ordinarily language and thinking develop together. However, children born with profound deafness, or afflicted with it at a very early age, do not learn ordinary language as the usual by-product of living; through them we may study the influence of language deficiency on the development of cognitive functions and clarify the role of language in cognition. There appears to be wide agreement that the average deaf person is inferior to hearing persons in all activities requiring thinking in abstract terms (Levine, 1950). Ewing and Stanton (1943), Templin (1950), Myklebust and Brutten (1953), and Oléron (1953) among others have held that the “conceptual retardation” of deaf people is intrinsically related to their lack of language experience. 

The purpose of this study is to demonstrate that the capacity of deaf people to deal with conceptual tasks may not in fact be generally retarded or impaired. Cognitive capacity, it is proposed, develops naturally with living, whether or not spoken language is part of the child’s experience, and the role of language is restricted and extrinsic: e.g., familiarity with certain words may increase the efficiency with which the solution of certain problems may be reached. It follows that deaf children should not differ from hearing children in their performance on conceptual learning tasks where it can be assumed that specific language experience does not favor the hearing child. However, on conceptual tasks where one has reason to assume that specific language familiarity gives the hearing child an advantage, deaf children should be inferior to hearing children. 

Following a lead of Levy and Cuddy (1956), three concept learning tasks, differing in relation to the language repertoire of the two samples, were chosen to measure the basic cognitive potential of the children and establish norms for various age groups. The tasks were nonverbal and consisted of the operational attainment of a concept or a principle according to which a subject’s choice could be consistently correct. For the first two tasks the correct principle or concept could be assumed to be as familiar or unfamiliar to the deaf as to the hearing children. The idea of “same,” involved in the first task, is so primitive that workers with deaf children report that there is no deaf child in school who does not have at least some gesture for this idea. On the basis of Levy and Ridderheim’s (1958) study, partially replicated here, it seems that hearing children before the age of 12 do not have the concept of “symmetry” ready available and, therefore, also should have no advantage over the deaf children on the Symmetry task. 

The concept of “opposition,” however, should be quite familiar to hearing children beyond 6 years old, as a study of Kreezer and Dallenbach (1929) showed. Our language employs many dimensions in terms of opposites: hot-cold, good-bad, long-short. As a rule a child learns the words denoting extremes before he learns the words characterizing the dimension as such. In such linguistic contexts a child becomes naturally acquainted with the concept of opposition and when he has reached the age of 6 can readily grasp the meaning of the word “opposite.” In distinction from the hearing child, the deaf child, without the benefit of this specific language experience, finds it relatively difficult to learn the concept of opposite for one dimension and then generalize the concept to other dimensions. Teachers of the deaf report that their pupils are being “taught” the meaning of “opposite” when they are in the intermediate grades or about 14 years old. 
METHOD 
Subjects 


All pupils aged 7-12 years from the three schools for the deaf located in one state, with an additional 44 subjects from a neighboring state, made up the deaf sample. Excluded were those few who did not have an early hearing loss of at least 50 decibels in the better ear. There were 30 subjects in each of the six age groups. Since the majority of the deaf sample consisted of practically all deaf children of the desired age living in the state at one time, it can be reasonably assumed that a relatively unbiased and representative sample of deaf children was achieved. The hearing sample consisted of 180 subjects randomly selected from five different grade schools and classified into six age groups, 7-12 years, of 30 subjects each. The hearing sample had an equal number of boys and girls; however, the deaf sample at ages 7, 9, and 12 showed a slight imbalance in favor of one sex. 

The procedure of matching the hearing and deaf group only on age and sex and permitting other possibly pertinent variables to vary randomly was judged superior to an attempted control of IQ, institutional education, etc. With regard to IQ in particular, it is known that tests, standardized on a hearing sample and based on an average experiential and educational background, are of questionable value with deaf children. The deaf group had a somewhat lower overall socioeconomic rating than the particular hearing sample. In the light of the reported relationship between the intelligence of children and parental occupation (Terman & Merrill, 1937), the somewhat uneven socioeconomic distribution in the two samples would bias the results, if at all, against the deaf and, therefore, against the main hypothesis. 

Tasks 


The three tasks employed in the study were these: 

1. Sameness task. This task consisted of a series of 40 different pairs of round tin covers with two simple figures drawn on each cover. The two figures on one of the covers were identical, on the other the two were different. Under the cover with the identical figures a checker was placed indicating to the child the correct choice. The criterion for success was 10 consecutive correct choices. The trials were terminated after the criterion was reached, or with the first error after Trial 30. A modified stimulus presentation apparatus was used in this and the following task. 

2. Symmetry task. Forty different pairs of 7 × 9 inch cards were prepared for this task and simple figures were drawn in heavy black ink on a white background. On one card of each pair the figure to be rewarded was a symmetrical one, while on the other it was asymmetrical. The procedure and criterion were the same as those used with the Sameness task. 

3. Opposition task. This task was in two parts. (a) Opposition Acquisition: From a set of eight wooden discs ranging in diameter from .5 inch to 2.25 inches and hidden from the subject’s view, four were selected in a fixed order and randomly thrown on the table in front of the child. The experimenter either pointed to the largest or to the smallest disc. If the experimenter picked the largest, the subject’s task was to discover that he had to pick the smallest; if the experimenter picked the smallest, the subject had to pick the largest. Subjects were given a maximum of 36 trials. The criterion was six consecutive correct choices. (b) Opposition Transfer: One uncorrected trial was given only to those subjects who succeeded on Opposition Acquisition, on each of the following six Transfer dimensions: Volume, Length, Number, Brightness, Position, and Texture. While the experimenter pointed to the stimulus on one extreme of the continuum of four or five stimuli, the child showed transfer of concept by pointing to the opposite extreme. 

The performance of the two groups was compared in terms of the proportion of subjects reaching the criterion. Other methods of measuring performance yielded similar trends and are not reported here except in the case of Opposition Acquisition. For Opposition Transfer the total number of nontransferred responses was used for computational purposes. 
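
The stopping rule for the Sameness and Symmetry tasks (success after 10 consecutive correct choices, otherwise termination at the first error after Trial 30) can be written out as a small sketch. The function and parameter names are illustrative, and the cutoff of 30 is taken here as an assumption consistent with a series of 40 pairs, since an error after that point leaves too few trials to complete a run of 10.

```python
from typing import Iterable, Tuple


def run_to_criterion(
    outcomes: Iterable[bool],
    criterion: int = 10,     # consecutive correct choices required
    cutoff_trial: int = 30,  # assumed: an error after this trial ends the session
) -> Tuple[bool, int]:
    """Return (reached_criterion, number_of_trials_given)."""
    streak = 0
    trials = 0
    for correct in outcomes:
        trials += 1
        if correct:
            streak += 1
            if streak >= criterion:
                return True, trials
        else:
            streak = 0
            if trials > cutoff_trial:
                return False, trials
    # The series of pairs ran out before either stopping condition was met.
    return False, trials


# One early error, then ten correct choices in a row.
result = run_to_criterion([False] + [True] * 10)
```

The same function covers Opposition Acquisition with `criterion=6` if the caller supplies at most 36 outcomes.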


RESULTS 

The results for the three Acquisition tasks 
are presented in Table 1. The chi squares are 
based on the actual number of passing and 
failing subjects, a total of 30 for each age group 
and 180 for each entire sample. 
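
For readers who want to check individual entries, the chi square for one age group can be computed from the number passing in each sample of 30. The sketch below applies Yates' continuity correction; the article does not say that this correction was used, but under that assumption the function reproduces tabled values such as 4.27 (11 hearing versus 20 deaf successes on Sameness at age 10) and 4.31 (12 versus 21 on Symmetry at age 10).

```python
def chi_square_pass_fail(pass_a: int, pass_b: int,
                         n_a: int = 30, n_b: int = 30,
                         yates: bool = True) -> float:
    """Chi square for a 2 x 2 pass/fail table comparing two groups.

    yates=True applies the continuity correction (an assumption about
    the original computation, not something the article states).
    """
    a, b = pass_a, n_a - pass_a   # group A: pass, fail
    c, d = pass_b, n_b - pass_b   # group B: pass, fail
    n = n_a + n_b
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column total is zero")
    difference = abs(a * d - b * c)
    if yates:
        difference = max(difference - n / 2.0, 0.0)
    return n * difference ** 2 / denominator


sameness_age_10 = chi_square_pass_fail(11, 20)
symmetry_age_10 = chi_square_pass_fail(12, 21)
print(round(sameness_age_10, 2), round(symmetry_age_10, 2))
```

Without the correction (`yates=False`) the same counts give noticeably larger values, which is why the corrected form is assumed here.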

The comparative results for the Sameness 
and Symmetry tasks were fairly similar. The 
hearing group showed no superiority over the 
deaf; on the contrary, there was a tendency 
for the younger deaf children to surpass the 
hearing children of comparable age. Also, the 
overall comparison for all ages combined on 
the Symmetry task was significantly in favor 
of the deaf group. 

Both groups showed a significant change of 
proportional success with age: on the Sameness 
task the chi square (df = 5) for the hearing 
group was 18.27 (p < .01) and for the deaf 
group 13.75 (p < .02); on the Symmetry 
task the respective values were 23.04 and 29.00 
(p < .01). While this change was consistently 
in a positive direction for the hearing children, 
the 11-year-old deaf group reversed the trend 
for the deaf sample: the chi square, comparing the successes of the 11- and 10-year-old deaf children for the Sameness task was 5.42 (p < .05) and for the Symmetry task 4.27 (p < .02). 


TABLE 1 
NUMBER OF SUCCESSFUL HEARING AND DEAF SUBJECTS ON CONCEPT TASKS IN EACH AGE GROUP 
(30 subjects in each age group) 

             Sameness                 Symmetry                 Opposition Acquisition 
Age      Hearing  Deaf   χ²       Hearing  Deaf   χ²       Hearing  Deaf   χ² 
7           4       7     .44        6       4     .12        29      14    16.00** 
8           6      16    5.81*      ...     11    3.20        26      19     3.20 
9          12      16     .60       ...     19    6.72**      28      23     2.09 
10         11      20    4.27*      12      21    4.31*       30      28      .31 
11         13      11     .07       ...     11     .27        30      28      .31 
12         22      15    2.54       ...     19     ...        30      27     1.40 
Total      68      85    2.91       65      85    4.13*      173     139    26.18** 

* p = .05. 
** p = .01. 


Hearing children were significantly superior to deaf children on the Opposition Acquisition task, as indicated by the chi square testing for overall differences. The consistency of this superiority at each age level is more clearly revealed in Table 2. The median number of errors for the combined samples was 3.05. Analyzing these data for improvement of performance with age, the chi square values of 27.67 for the hearing group and 14.17 for the deaf group were found to be significant at the .01 and .02 levels, respectively (df = 5). 


TABLE 2 
NUMBER OF SUBJECTS MAKING LESS THAN MEDIAN ERRORS ON OPPOSITION ACQUISITION 

Age       Hearing    Deaf     Chi square 
7           8         5         5.79* 
8           9         3         3.75 
9          16         7         5.41* 
10         17         8.5       4.93* 
11         19.5       ...       7.37** 
12         25        10        15.43** 
Total      94.5      38        36.79** 

* p = .05. 
** p = .01. 


Table 3 provides the relevant data for the Transfer test. As the nontransferred responses were fairly normally distributed they were subjected to an analysis of variance. For this purpose 34 hearing subjects were randomly eliminated so that the N of corresponding cells for each age group was equal. The analysis yielded an F of 24.02 for the Hearing-Deafness variable, an F of 9.42 for the Age variable, and an F of 4.50 for the interaction. The significant (p < .01) interaction and differences obtained between the two samples at each age level indicated that the hearing children were superior, the degree of superiority varying with age. 


TABLE 3 
MEAN NONTRANSFERRED RESPONSES OF DEAF AND HEARING SUBJECTS ON OPPOSITION TRANSFER IN EACH AGE GROUP 

               Hearing                  Deaf 
Age       N     Mean responses     N     Mean responses 
7         29        2.00           14        2.78 
8         26        1.54           19        2.80 
9         28        1.25           23        2.17 
10        30         .73           28        1.61 
11        30         .60           28        1.78 
12        30         .57           27        1.48 
Total    173        1.10          139        2.00 


DISCUSSION 


The facilitative influence of language was highlighted in the present study when one considers that the same deaf children who demonstrated their equality on the Sameness and Symmetry tasks were consistently below the same hearing children on the Opposition task. On the two former problems, the deaf children in the lower age range were actually superior, perhaps because of their less sophisticated approach to the problem situation. Insofar as the problem required the attainment of one simple concept, the hearing child’s greater store of available categories may have been distracting. 

Regarding the developmental trend in the 
hearing group, the findings on the Symmetry 
task are in excellent agreement with the results 
of Levy and Ridderheim (1958) on an entirely 
different sample. The drop in the relative 
performance of the 11- and 12-year-old deaf 
group is somewhat puzzling, if it is not a 
sampling artifact. 

Successful performance on these concept 
learning tasks, it should be understood, is 
related to but in no way identified with the 
knowledge of the concept or of the word. Al- 
though both deaf and hearing children at the 
youngest age level knew the word or the con- 
cept of sameness, only few of them succeeded 
on the Sameness task. At the other extreme, 
neither deaf nor hearing 7-year-old children 
were familiar with the word “symmetry,” 
yet a few 7-year-old children succeeded on that 
particular task. 




SUMMARY 


Contrary to widely accepted conclusions 
that deaf people are inferior in conceptual 
thinking and the theories proposed to link 
conceptual inferiority and language retarda- 
tion, the present study suggested that the in- 
fluence of language on concept formation is 
extrinsic and specific. According to this view, 
language experience may increase the efficiency 
of concept formation in a certain situation, 
but is not a necessary prerequisite for the 
development of the basic capacity to abstract 
and generalize. 

To test this assumption, 180 deaf and 180 
hearing subjects, 30 subjects for each age 
group from 7 to 12 years, were given three 
nonverbal concept learning tasks differing
with respect to the relevance of the language
experience of the two samples. With regard to
the Sameness and Symmetry tasks, the groups
were assumed to be equivalent in relevant
language experience. In contrast, concerning
Opposition Acquisition and Transfer, specific
language experience was assumed to give
hearing children an advantage over the deaf.
Accordingly, it was predicted that the hearing
subjects would not be superior to the deaf
subjects in their performance on the Sameness
and Symmetry tasks, yet would be superior
on Opposition Acquisition and Transfer prob-
lems. The results confirmed the experimental
predictions and gave support to the proposed
theory.

INTERPERSONAL SENSITIVITY AND MOTIVE STRENGTH


DAVID E. BERLEW1


Wesleyan University 


INTERPERSONAL sensitivity implies em-
pathy, understanding, ability to judge
others, sensitivity to other people, and
other similar concepts. Efforts to investigate
interpersonal sensitivity as a personality trait
have been frustrated by the lack of a satisfac-
tory technique for measuring sensitivity. In
the last decade the general procedure has been
to ask judges (Js) to predict the responses of
others (Os) to a questionnaire of some sort, 
and then to compute accuracy scores by com- 
paring each J’s predictions with the actual 
responses of the Os. Recently a number of 
studies have demonstrated that accuracy 
scores obtained in this way are more the result 
of statistical artifacts and response sets than 
of any differential predictive accuracy on the 
part of the Js (Bender & Hastorf, 1953; Cron- 
bach, 1955; Crow & Hammond, 1957; Gage & 
Cronbach, 1955; Gage, Leavitt, & Stone, 1956; 
Hastorf & Bender, 1952). Several investigators 
have suggested procedures for measuring vari- 
ous aspects of interpersonal sensitivity which 
yield scores apparently free from response set 
influence (e.g., Cronbach, 1955; Gage, 1952). 
Bronfenbrenner and his co-workers (Bron- 
fenbrenner, Harding, & Gallwey, 1958) at 
Cornell University used such a procedure for 
obtaining interpersonal sensitivity scores that 
were apparently reliable and which related 
significantly to personality (behavior) vari- 
ables. 

Using the Bronfenbrenner et al. (1958) technique
for obtaining interpersonal sensitivity scores
free from artifactual components, the present
study investigated the relationship between
motivation and sensitivity to others. The
hypothesis was that moderately motivated Js
would make more accurate judgments of
Os than either highly motivated or relatively
unmotivated Js.

If judging others is viewed as a complex
problem solving task, certain empirical findings
are relevant to the hypothesis. Most rele-
vant, perhaps, is the Yerkes-Dodson principle
(Yerkes & Dodson, 1908) that moderate
motivation is optimal for efficient performance
on complex tasks. Since Yerkes and Dodson
made their discovery in 1908, other studies
have demonstrated that their conclusion ap-
plies to a variety of cognitive processes essen-
tial to efficient problem solving. Motive in-
tensity and attention to motive related cues are
positively related (Atkinson & Walker, 1956;
Wispé & Drambarean, 1953), but at the upper
extremes of motive intensity there is some loss
of perceptual objectivity (Allport, 1955, Ch.
13). Moderate motivation facilitates “generic”
or “rule” learning, learning which has maxi-
mum transfer effect in other similar situations
(Bruner, 1957). Birch (1955) indicates that
moderate motivation is optimal for “insight-
ful problem solving.” Taken together, these
results suggest that moderate motivation is
optimal if a J must perceive a variety of mo-
tive related cues and perceive them veridically,
have learned something about the meaning
of identical or similar cues in past situations,
and process such information to arrive at an
accurate solution or judgment.

1 This study was carried out while the author was a
graduate student in the Department of Social Relations,
Harvard University.

Estimating the motivational level of the Js
is crucial to any test of the hypothesis. Mc-
Clelland and his co-workers (Atkinson, 1958;
McClelland, Atkinson, Clark, & Lowell, 1953)
have had considerable success inferring the
strength of various motives from imaginative
stories. Atkinson (1953), however, has pointed
out that motivation inferred from content
analysis of fantasy is probably a “latent
characteristic of personality which is mani-
fested in behavior only when engaged or sup-
ported by appropriate environmental cues”
(p. 387). To estimate the intensity of a specific
motive in a given situation, one must know,
in addition to the strength of the motive pre-
disposition, the extent to which the situation
engages or arouses the motive.

Fortunately, there are data that provide
information as to the motive engagement or
arousal properties of task oriented small
groups. Factor analytic studies of such groups
indicate that group members perceive and
rate each other in terms of three basic dimen-
sions: task competence, influence, and affec-
tion (Tagiuri, 1958). Thus we can assume
that information or social cues built into the
groups are relevant to these three dimensions
to the relative exclusion of cues relevant to
other dimensions. Since the motives n Achieve-
ment, n Power, and n Affiliation correspond
very closely to task competence, influence,
and affection, respectively, it can be argued
that these three motives particularly should
be engaged or aroused by the small group
situation.

The problem of obtaining estimates of the
intensity of the subjects’ motives in the group
situation still remains. One solution is to
attempt to design a group situation that
arouses all three motives equally. If this can be
accomplished, the contribution of the situa-
tion to motive arousal can be treated as a
constant across all three motives, and the
fantasy measure of motivation (or motive
predisposition) used as an estimate of motive
strength. This is admittedly a gross method
for obtaining estimates of motive intensity in a
particular situation, but the desirability of
studying several motives simultaneously seems
to warrant a departure from more conventional
methods of controlling motive intensity even
at the expense of some rigor.

In light of the foregoing reasoning, it was 
predicted that Js with a moderate n Achieve- 
ment, n Power, and n Affiliation would make 
more accurate judgments of Os in a task
oriented small group situation than either
highly motivated or relatively unmotivated Js. 

METHOD 
Subjects and Setting 

The subjects were male members of a Harvard
College social relations class. The subjects were invited
to participate in small informal discussion groups as
preparation for an approaching examination. Eight six-
man discussion groups were scheduled. Except for the
condition that students who were acquainted could
not be members of the same group, subjects were
allowed to sign up for any group that met at a time
convenient for them. Due to absentees, only three of
the groups had a full complement of six members; the
other five groups had five participants. Each
group discussed the course material for 2.5
hours. During this period they were not aware that
[...] to other group members was to be [...].

Factors calculated to moderately engage each of the
motives being studied were present in the situation.
Discussing course material in preparation for an
examination should engage the subjects’ n Achieve-
ment; the facts that the test in question was an “hour
exam” and not a final, and success on it did not neces-
sarily depend on what they contributed to or got
from the group discussion, were expected to mollify
what otherwise might be a highly arousing situation.
Similarly, the lack of a designated chairman or leader
for the discussion, and the fact that group members
were meeting each other for the first time, were designed
to moderately engage the power and affiliation motives,
respectively.



Measure of Interpersonal Sensitivity 


Following the discussion period, each group member
was asked to rate himself from 0 to 6 on the series of 12
adjectives, half representing desirable and the other
half undesirable qualities, originally used by Bronfen-
brenner et al. (1958, p. 51).2 The instructions were:
“Describe yourself with respect to the way you acted
during the discussion in which you have just partici-
pated.” After he had rated himself, each group member
was asked to predict how each other group member had
rated himself on the same list of adjectives.

Interpersonal sensitivity (IS) scores were computed
using Bronfenbrenner’s “method of differential com-
parison.” The method of computing such a score for a
single subject consists of expressing his prediction of
each other group member’s self-rating on a specific
item as a deviation ($x_i$) from the mean of his predictions
for all the other group members on that item, expressing
each other group member’s self-rating on a specific item
as a deviation ($y_i$) from the mean of all the other group
members’ self-ratings on that item, and then computing
the simple correlation ($r_{x_i y_i}$) between $x_i$ and $y_i$ over
all items.3 The resulting correlation coefficient reflects
the subject’s ability to recognize individual differences
among other persons in their responses to each item.
Since both the criterion ratings for evaluating the
accuracy of predictions and the predictions themselves
are expressed as deviations from the respective means
for all members of the group, the measure is independent
of the subject’s similarity and sensitivity to the gen-
eralized other. After a statistical analysis of sensitivity
scores obtained using the “method of differential
comparison,” Bronfenbrenner et al. (1958) state: “It is
our belief that the only way in which a judge can obtain
a high score (i.e., a high correlation) with this type of
index—except, of course, through errors of measure-
ment—is by being aware of objective differences among
the persons he is asked to judge” (p. 49).

2 The “desirable” or “favorable” adjectives, adopted
from Bronfenbrenner, Harding, and Gallwey (1958),
were helpful, influential, interesting, observant, reason-
able, and warm. The “unfavorable” adjectives were
domineering, immature, shy, submissive, unimagina-
tive, and worried.

3 According to Bronfenbrenner, Harding, and
Gallwey (1958), this index corresponds to the “average
within-group correlation” in conventional analysis of
covariance and can be treated in exactly the same way
as the ordinary correlation coefficient except for the
fact that the usual methods for evaluating statistical
significance of r cannot be applied directly. Because
successive paired values cannot be assumed independ-
ent, the exact number of degrees of freedom associated
with r is not known.
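
The scoring procedure just described can be illustrated with a minimal sketch. It assumes that one judge's predictions and the other members' self-ratings are held as small rating tables (one row per other group member, one column per adjective), and that the deviations for all items are pooled into a single product-moment correlation. The function name `is_score`, the data layout, and the choice to return 0.0 for a judge who does not differentiate at all are illustrative assumptions, not details given in the text.

```python
from math import sqrt


def is_score(predictions, self_ratings):
    """Sketch of an interpersonal sensitivity (IS) score for one judge.

    predictions[o][k]  : the judge's prediction of other member o's
                         self-rating on item k (hypothetical layout)
    self_ratings[o][k] : other member o's actual self-rating on item k
    """
    n_others = len(predictions)
    n_items = len(predictions[0])
    xs, ys = [], []
    for k in range(n_items):
        # Item means are taken over the other group members only.
        pred_mean = sum(predictions[o][k] for o in range(n_others)) / n_others
        self_mean = sum(self_ratings[o][k] for o in range(n_others)) / n_others
        for o in range(n_others):
            xs.append(predictions[o][k] - pred_mean)    # deviation of prediction
            ys.append(self_ratings[o][k] - self_mean)   # deviation of self-rating
    # Deviations already have mean zero within each item, so the pooled
    # correlation reduces to a ratio of sums of products.
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: no differentiation is scored as zero
    return sxy / sqrt(sxx * syy)


# Three other members, two items; predictions track the self-ratings
# apart from a constant shift, so the score is at its maximum.
perfect = is_score([[1, 5], [3, 3], [5, 1]], [[2, 6], [4, 4], [6, 2]])
```

Because both sets of ratings are centered on their item means, a judge who merely guesses the typical self-rating for each adjective gains nothing; only correct ordering of the other members on an item raises the score.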


Measures of Motivation 


Six-card, self-administered Thematic Apperception
Test (TAT) booklets were completed by the subjects
during a regular class hour prior to the scheduling of
the small group meetings. The subjects were not aware
of any association between the TAT and the group
meetings. The six TAT pictures used comprise a
standard set recommended for use when distributions
of scores for the motives, n Achievement, n Affiliation,
and n Power, are desired (Atkinson, 1958). The TAT
protocols were scored for the three motives prior to the
small group meetings using coding systems developed
by McClelland and his co-workers (Atkinson, 1958).
The experimenter, who scored the protocols, has
demonstrated reliability of over .90 with the practice
materials included in the three scoring manuals. In
addition, reliability coefficients between the experi-
menter and a second experienced scorer were .95, .93,
and .87 for n Achievement, n Affiliation, and n Power,
respectively, on a sample of one-third of the protocols
used in the study.

RESULTS 

The IS scores, in the form of correlation 
coefficients, ranged from —.31 to +.50, with 
a mean of .23, a median of .29, and a standard 
deviation of .21. As noted previously (see 
Footnote 3), the usual methods for evaluating 
statistical significance of r cannot be directly 
applied to the IS scores because the exact 
number of degrees of freedom cannot be de-
termined.4

The eight discussion groups did not differ 
with respect to mean and standard deviation 
of IS scores. For purposes of analysis, Js were 
designated as Sensitive or Insensitive according 
to whether their IS scores were above or below 
the median of the entire distribution of scores. 

The distribution of scores for each motive 
was divided as nearly as possible into thirds 
and each J classified as high, moderate, or 
low on each motive. In Table 1 Sensitive and 
Insensitive Js are classified according to the 
strength of each motive. 

The predicted curvilinear relationship be-
tween the motives, n Power and n Affiliation,
and IS scores is significant. Sensitive Js tend
to be characterized by moderate n Power and
n Affiliation, whereas Insensitive Js tend to
have either high or low n Power and n Affilia-
tion. The relationship between n Achievement
and IS scores, although in the predicted direc-
tion, is not significant. In considering these
results, it is important to note that the three
motives are not positively related; in fact,
there is a significant negative correlation be-
tween n Power and n Achievement (r = -.38,
p < .01). While this negative relationship
increases the probability that particular in-
tensity patterns relative to the two motives
will occur (i.e., moderate-moderate, and high-
low), there is no reason to believe that it con-
tributes spuriously to the relationship between
motive intensity and interpersonal sensitivity.


TABLE 1
NUMBER OF SENSITIVE AND INSENSITIVE Js OBTAIN-
ING LOW, MODERATE, OR HIGH SCORES
ON n ACHIEVEMENT, n AFFILIATION,
AND n POWER

Interpersonal              n Achievement     n Affiliation       n Power
sensitivity                Low  Mod.  High   Low  Mod.  High   Low  Mod.  High
Sensitive Js (N = 22)        6     9     7     5    12     5     6    12     4
Insensitive Js (N = 21)      9     4     8     8     5     8     8     3    10
Chi square*                      2.43              4.25              7.67
p (two-tailed)                    .20               .05               .01

* Computed with one df by combining H and L cells to form a
2 X 2 table.


4 Treating IS scores as r’s to obtain a rough idea of
how successful the subjects were in making judgments,
27 of the 43 subjects exceed chance accuracy at the
.05 level, with 17 of these 27 subjects attaining scores
significant at the .01 level.
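
As a rough check on the chi square values in Table 1, the sketch below collapses each motive into a 2 X 2 table (moderate versus high-or-low, by Sensitive versus Insensitive) and applies the ordinary Pearson formula without a continuity correction. The function name and the reading of the moderate cell counts from the table are assumptions made for illustration; whether the original computation used exactly this formula is not stated in the text.

```python
def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi square for the 2 X 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


N_SENSITIVE, N_INSENSITIVE = 22, 21

# Moderate-cell counts as read from Table 1: (Sensitive Js, Insensitive Js).
moderate_counts = {
    "n Achievement": (9, 4),
    "n Affiliation": (12, 5),
    "n Power": (12, 3),
}

chi_squares = {}
for motive, (mod_sens, mod_insens) in moderate_counts.items():
    # High and Low cells are combined into a single "extreme" cell.
    chi_squares[motive] = chi_square_2x2(
        mod_sens, N_SENSITIVE - mod_sens,
        mod_insens, N_INSENSITIVE - mod_insens,
    )

for motive, value in chi_squares.items():
    print(f"{motive}: chi square = {value:.2f}")
```

Only the moderate counts and the two group sizes enter the calculation, so the check does not depend on how the high and low cells are split.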

To test the relationship between interper- 
sonal sensitivity and general level of motiva- 
tion, each motive score was converted to a 
z score representing its distance in standard 
deviation units from the mean of the motive 
score distribution. A total z score was com-
puted for each J by summing the z scores for
each motive, disregarding signs. Thus a high 
total z score represents extremes in motive 
intensity (either high or low) and a low total 
z score represents moderate motive intensity 
across all three motives. 
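
The total z score can be sketched as follows. The example assumes each motive's scores are held in a list with one entry per J, and it standardizes with the population standard deviation; the text does not say whether a population or a sample standard deviation was used, and the function and variable names are hypothetical.

```python
from statistics import mean, pstdev


def total_abs_z(scores_by_motive):
    """Sum of absolute z scores across motives, one total per judge.

    scores_by_motive maps a motive name to a list of raw scores,
    with judges in the same order in every list (assumed layout).
    """
    n_judges = len(next(iter(scores_by_motive.values())))
    totals = [0.0] * n_judges
    for scores in scores_by_motive.values():
        m = mean(scores)
        sd = pstdev(scores)  # assumption: population SD
        if sd == 0:
            raise ValueError("all scores identical; z scores undefined")
        for j, score in enumerate(scores):
            # Signs are disregarded, so high and low extremes both add.
            totals[j] += abs((score - m) / sd)
    return totals


# Toy data for three judges: the middle judge is moderate on both motives.
toy_totals = total_abs_z({
    "n Achievement": [1, 2, 3],
    "n Power": [3, 2, 1],
})
```

A judge at the mean on every motive receives a total of zero, while a judge who is extreme in either direction on several motives receives a large total, which is the sense in which the index measures distance from moderate motivation.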

The mean total z scores obtained for Sen-
sitive and Insensitive Js were 2.075 and 2.897,
respectively. The mean difference of .822
yields a t of 3.000 which is significant at the
.005 level, using a one-tailed test of significance.
An attempt to assess more sensitively the
covariation of motive intensity (total z scores)
and IS scores yielded an r of -.20, just signifi-
cant at the .05 level.


DISCUSSION 


The results clearly support the hypothesis:
moderately motivated Js are distinctly su-
perior to highly motivated and unmotivated
Js at judging Os in a task oriented small group
situation. The same trend is apparent whether 
one motive or all three motives are used as 
estimates of motive intensity. 

Since the group sessions were neither ob- 
served nor tape recorded, it was impossible to 
determine objectively whether or not there 
was roughly the same amount of social inter- 
action relevant to each of the three dimensions, 
task competence, influence, and affection.
However, from interviewing the subjects, the
experimenter got the general impression that
there was. The fact that a significant pro-
portion of the variance in IS scores can be
attributed to differences in Js’ motivation 
also suggests that the attempt to control 
situational factors was relatively successful. 
It is doubtful whether the predicted curvilinear 
relation would have obtained for each motive 
and sensitivity had one motive been aroused 
disproportionately. For example, a situation 
extremely arousing to a particular motive 
would serve to moderately arouse or motivate 
even those Js with a relatively weak motive 
predisposition, thereby enhancing their ability 
to make accurate judgments of Os. In the 
same situation, Js with a moderately strong 
latent motive would become highly aroused 
and thus be relatively poor at judging Os, 
while Js with a strong latent motive would 
become totally inept as Js. In this hypothetical 
instance, the relationship between latent 
motive intensity and sensitivity would be 
linear and negative rather than curvilinear. 
A situation with a very slight tendency to 
engage or arouse a motive might conceivably 
produce just the opposite effect—a positive 
linear relationship between latent motivation 
and sensitivity. 

One question that might be asked is whether 
the obtained results cannot be accounted for 
in some simpler way, without reference to the 
relationship of motive intensity to sensitivity
(cf. Cronbach, 1958). One point worth reiterat-
ing is that the estimates of motive intensity
and interpersonal sensitivity are experimen-
tally independent. Thus except for the pos-
sibility of measurement error, the empirical
relationship between the two variables can 
be accepted as genuine. The meaning of the 
results is quite another problem. One might 
ask, for example, whether it is not possible 
that persons with low motivation use only the 
middle of the rating scale, whereas highly 
motivated subjects concentrate their ratings 
at the extreme ends of the scale, resulting in 
low IS scores for both groups? The answer to 
this and similar questions is that subjects 
differing in motive intensity may indeed use 
the rating scales in consistently different 
ways, but that is part of the reasoning under- 
lying the prediction the study was designed 
to test. Subjects with low motivation were 
expected not to be attentive to interpersonal 
cues, and as a result either refuse to differ- 
entiate or be forced to make wild guesses. 
Highly motivated subjects, on the other hand, 
were thought to be predisposed by their needs 
and expectations to see Os in certain ways, 
so their judgments should reflect a tendency 
not to differentiate, or to make only gross, 
dichotomous discriminations among persons. 
In either case, motivation affects the pattern- 
ing of responses, and the patterning of re- 
sponses is one factor that affects IS scores. 
The study does not attempt to separate the 
effect on interpersonal sensitivity of a tend- 
ency not to differentiate among Os and 
efforts to differentiate that are inept. Both 
factors operate to determine the sensitivity 
of a person’s interpersonal perceptions. 

It should be pointed out that while subjects 
were encouraged by the instructions to try to 
discriminate among Os, the form of the test 
instrument did not force them to do so. This 
study does not answer the question as to how 
well subjects characterized by different degrees 
of motivation might be able to judge Os if 
actually forced to differentiate. However, if 
the arguments just presented have any valid- 
ity, the same hypothesis should apply. 

Several comments concerning the interpre- 
tation of the results of this study are in order. 
Ability to judge Os’ self-percepts may eventu- 
ally prove to be a convenient index of inter- 
personal sensitivity, but for the present it 
must be considered one discrete skill that may 
or may not be related to other aspects of sen- 
sitivity to others. Assumptions regarding the 
level of motive intensity of subjects in the 
group situation are tenuous, at best; more
rigorous means of controlling the contribution
of the situation to motive arousal should be 
developed or adopted. Finally, no conclusions 
can be drawn concerning the relationship 
between the intensity of a specific motive and 
ability to make certain kinds of judgments 
(e.g., judgments related to that motive). 
As Cronbach (1955) has pointed out, no sen- 
sitivity score can be thoroughly understood 
until it has been broken down into discrete 
and meaningful component parts. 


SUMMARY 


On the assumption that judging other people 
is a special case of complex problem solving, 
the Yerkes-Dodson (1908) principle concerning 
the relationship between problem solving 
efficiency and motive intensity should apply. 
The hypothesis tested was that Js with mod- 
erate n Achievement, n Affiliation, and n 
Power would make more accurate judgments 
of Os in a task oriented small group situation 
than either highly motivated or relatively 
unmotivated Js. 

Six-card TAT booklets were completed by 
43 male undergraduates and coded for n 
Achievement, n Affiliation, and n Power. The 
subjects then met in groups of five or six to 
prepare for an undergraduate examination. 
At the end of 2.5 hours of discussion, each 
group member was asked to rate his behavior 
in the group, and then to predict how each 
other group member had rated himself. Inter- 
personal sensitivity scores free of response 
set influence were computed using Bronfen- 
brenner’s “method of differential comparison.” 

The results supported the hypothesis: Js 
with moderate motivation as reflected by an 
index combining the three motive scores ob- 
tained higher sensitivity scores than either 
highly motivated or relatively unmotivated 
Js. Need for Power and n Affiliation, taken 
separately, showed a similar relationship to 
interpersonal sensitivity scores despite the 
fact that the distributions of motive scores 
were not positively related. 
THE CONDITIONING OF VERBAL BEHAVIOR AS A FUNCTION OF
THE NEED FOR SOCIAL APPROVAL


DOUGLAS P. CROWNE AND BONNIE R. STRICKLAND1


Ohio State University


In a series of recent studies (Crowne &
Marlowe, 1960; Marlowe & Crowne,
1961; Strickland & Crowne, in press),
the behavior of individuals displaying a socially
desirable response set on personality indices
has been investigated both in psychometric
situations and on experimental tasks involving
nontest measures as dependent variables.
These studies suggest that it is meaningful
to conceptualize socially desirable responding
on personality inventories as a function of a
need for social approval. In contrast to more
descriptive approaches to the problem of
stereotyped responding (Edwards, 1957), it
has been assumed in these investigations that
self-evaluative styles in psychometric situa-
tions reflect personality characteristics of the
respondent (cf. Couch & Keniston, 1960;
Jackson & Messick, 1958). The conceptualiza-
tion of need for social approval provides a
predictive link between socially desirable
responding on the scale described below and
the degree of yielding to the perceived demands
of nontest situations. The behavioral depend-
ent variables that have been investigated
include conformity in a simulated Asch situa-
tion (Strickland & Crowne, in press) and the
favorability of attitudes expressed towards
a boring task (Marlowe & Crowne, 1961).

While the postulation of a need for social
approval gives theoretical consistency to the
results that have been obtained, further sup-
port of this motivational conception is clearly
desirable. This experiment was designed to
assess further the motivational properties of
the need for social approval and to attempt to
extend the meanings of the construct.

A verbal conditioning paradigm was chosen
for two reasons. First, the verbal conditioning
situation has been widely considered as a
miniature and simplified model for more com-
plex interpersonal situations like interviewing
and psychotherapy (Krasner, 1958; Ullmann,
Krasner, & Collins, 1961), and the reinforcers
typically employed have been of a “social”
nature (e.g., “mm-hmm,” “good,” etc.). Thus,
this experimental situation appeared to provide
a task particularly appropriate to investigate
the effects of a need for social approval. Second,
verbal conditioning constitutes a difficult
and exacting test of the hypothesis because
subjects are typically unaware of the rein-
forcement contingency. Hence, the need for
social approval would have to exert its in-
fluence without the mediation of awareness
of the correct response-reinforcement sequence.

The major hypothesis of the study can be
stated as follows: Individuals whose need for
social approval is high increase their rate of
response in a verbal conditioning task in which
reinforcement involves approval to a greater
degree than those with a weaker approval
need. Specifically, under conditions of positive
reinforcement, subjects with a high approval
need, in contrast to subjects less motivated
for approval, tend to show a greater increase
in the reinforced response class. But further,
it follows that approval motivated individuals
should be more affected by negative reinforce-
ment in interpersonal situations, since this
connotes denial of approval and punishment.
Thus, within the compass of the major hypoth-
esis, it was additionally hypothesized that
subjects with a high need for approval tend
to show a greater decrease in the negatively
reinforced or punished response class than
subjects to whom approval satisfactions are of
less consequence.


METHOD


Subjects


The subjects in this experiment were 145 under-
graduate students in introductory psychology classes
at the Ohio State University, 74 males and 71 females,
who volunteered to serve in a study of “verbal reac-
tions.” Subjects were randomly assigned to the experi-
mental conditions as they appeared for appointments.
Male subjects were run by a male experimenter and
female subjects by a female experimenter in both rein-
forced conditions. This restriction was not placed on the
assignment of subjects to the four experimenters who
ran the nonreinforced control group.

1 The authors would like to acknowledge the helpful
advice and suggestions given by Reed Lawson,
Shephard Liverant, and Delos D. Wickens in the de-
sign and analysis of this study and in the preparation
of this report.


Procedure 


The verbal conditioning task employed was the 
plural nouns paradigm of Greenspoon (1955). Each 
subject was brought to the experimental room and 
seated directly facing the experimenter. (In Green- 
spoon’s procedure, the experimenter sat behind and out 
of sight of the subject.) The subject was then informally 
asked his year in college, major, and vocational aspira- 
tions. The purpose of this brief and structured inter- 
action was to get the subject to talk about himself to 
the experimenter and to establish the experimenter as 
a source of approval satisfactions. 

The experimental task itself was then introduced, 
employing the following modification of Greenspoon’s 
instructions: 

What I want you to do is to say all the words you 
can think of. Say them individually. Do not use any 
sentences or phrases. Do not count. Please continue 
until I say stop. I will be writing down some things 
as you say the words. Do you have any questions 
at this point? All right, go ahead. 

The subject then said words for 25 minutes. In the 
positive reinforcement condition, every plural noun 
uttered by the subject was immediately followed by the 
experimenter’s “mm-hmm” and a head nod. The sub-
jects in the negative reinforcement condition elicited
an “uh-uh” and a shake of the head for each plural 
noun spoken. A nonreinforced control group was em- 
ployed to establish the base rate of plural nouns, and 
subjects in this condition said words for the duration of 
the experimental time in the absence of any verbal re- 
inforcement by the experimenter. For recording purposes, 
the conditioning task was divided into five 5-minute 
periods, during which a frequency tabulation was kept 
of the subject’s responses, plural and nonplural, which 
the subject was not permitted to see. In the course of 
this procedure, the subject’s spontaneous questions 
were briefly answered only if they involved a necessary 
clarification of the task. The subject was reminded of 
the instructions if at any time he stopped or gave unac- 
ceptable responses (counting, using sentences, etc.). 
In actuality, few subjects required further elaboration 
or reminders. 

At the end of 25 minutes, the subject was stopped 
and asked the following questions: 

1. What do you think it was all about?

2. How did you go about deciding which words to
say?

3. Did you notice any change in the kind of words
you were saying?

4. Was there anything I did that you particularly
noticed? (If “mm-hmm” [“uh-uh”] was not spon-
taneously mentioned, the subject was then asked:
What about my saying “mm-hmm” [“uh-uh”]?)

5. What do you think the purpose of that was?

These questions were adapted from those employed in
several studies reviewed by Krasner (1958) and were
considered to represent a reasonably careful test of the
subject’s awareness of the response being reinforced.
Where necessary, the experimenter attempted to clarify
ambiguous responses by additional probing.

As a measure of performance on the conditioning
task, since the sheer number of words spoken potentially
could affect (and thereby confound) the results, each
subject’s score was expressed as the ratio of plural
nouns to the total number of words emitted, thus con-
trolling for the effect of verbal output.
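
The performance score can be sketched for a single subject as below. The example assumes that plural and total word counts were tallied separately for each of the five 5-minute periods, as the recording procedure suggests; the function names, the sample counts, and the convention of scoring an empty period as 0.0 are illustrative assumptions only.

```python
def plural_ratio(plural_count, total_count):
    """Ratio of plural nouns to all words emitted in one period."""
    if total_count == 0:
        return 0.0  # assumption: a silent period is scored as zero
    return plural_count / total_count


def ratios_by_period(plural_counts, total_counts):
    """One ratio per recording period for a single subject."""
    return [plural_ratio(p, t) for p, t in zip(plural_counts, total_counts)]


# Hypothetical tallies for one subject across five 5-minute periods.
example = ratios_by_period([5, 10, 12, 9, 15], [50, 40, 60, 45, 50])
```

Expressing each period as a proportion means that a talkative subject and a reticent one are compared on the share of plural nouns in their output, not on the raw number of plurals spoken.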


Inventory 


The measure of the need for social approval was the
M-C SD scale (Crowne & Marlowe, 1960), a 33-item,
true-false questionnaire. The need for approval is in-
ferred from the extent of the subject’s endorsement of
the socially approved and desirable characteristics
which the items represent. An illustrative item is, “I
have never intensely disliked anyone.”

In the experimental procedure, the M-C scale and
the verbal conditioning task were presented in counter-
balanced order to eliminate an order effect. It is im-
portant to note that for all subjects the personality
inventory was not scored until after the completion of
the entire procedure. There was, thus, no possibility
that the experimenter’s knowledge of the M-C SD score
could influence the results.


RESULTS 


Preliminary to the analysis of the data, 19 
subjects were discarded. One subject was 
omitted for failure to comply with the instruc- 
tions (counting); 15 subjects were dropped 
from the negative reinforcement condition 
because they correctly verbalized the rein- 
forcement contingency; 1 subject was excluded 
from the positive reinforcement group for the 
same reason. Two subjects were dropped by 
means of random selection from the positive 
reinforcement condition in order to obtain 
proportionality of the cell Ns for the analysis 
of variance. Of the final N of 126 subjects, 42
were in each of the three conditions. These
subjects were further divided by dichotomizing
the scores on the M-C scale at the overall
mean (16.82) to yield the high and low need-
for-approval groups. The experimental design
was, thus, a 2 X 3 factorial with repeated 
measures across the treatments. The analysis 
of the data was accomplished by means of 
Grant’s (1956) orthogonal polynomial trend 
analysis. 
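
The dichotomization at the overall mean can be sketched as follows. The subject identifiers and scores are hypothetical, and placing a score exactly equal to the mean in the low group is an assumption; with integer scale scores and a mean of 16.82, no such tie arises in the study itself.

```python
from statistics import mean


def dichotomize_at_mean(scores):
    """Split subjects into 'high' and 'low' groups at the overall mean score."""
    cutoff = mean(scores.values())
    groups = {
        subject: ("high" if score > cutoff else "low")
        for subject, score in scores.items()
    }
    return cutoff, groups


# Hypothetical M-C SD scores for four subjects (not data from the study).
mc_scores = {"S01": 10, "S02": 14, "S03": 19, "S04": 23}
cutoff, groups = dichotomize_at_mean(mc_scores)
```

Crossing the two need-for-approval groups with the three reinforcement conditions gives the six cells of the 2 × 3 design.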

In the two specific hypotheses of the study, 
it was predicted that subjects with a high 
need for social approval show an increase in 
plural nouns under positive reinforcement 
and a decrease in plural nouns under negative 
reinforcement in contrast to subjects whose 
need for social approval is lower. Figures 1 
and 2, respectively, show the curves of the 
high and low need-for-approval groups under 
the positive and negative reinforcement con-
ditions and compare them with the curves of 
highs and lows in the nonreinforced control 
condition. From the curves in Figure 1, it 
would appear that the high need-for-approval 
group shows the predicted increase in the 
frequency of plural nouns, while the frequency 
of plural nouns for the low group is intermedi-
ate between the highs and lows of the control 
condition. In Figure 2, the high group seems 
to decrease in the use of plural nouns, as pre-
dicted, with the low group again occupying 
a position between the nonreinforced high and 
low need-for-approval groups. 

Fig. 1. Proportions of plural nouns given by high 
and low need approval groups under positive reinforce-
ment and nonreinforced control conditions. (Axis label: 
5-minute periods.) 

Table 1 presents the results of the analysis 
of variance. There is a significant departure 
from a zero slope (overall trend F = 3.83, p 
< .01) with curves following both linear and 
cubic functions. It is clear that change in the 
proportion of plural nouns occurred in the 
course of the experiment. Of major interest to 
the hypothesis are the significant between- 
group means effects. Inspection of this portion 
of Table 1 reveals both that the reinforcers 
were influential (F = 5.25, p < .01) and that 
they were differentially effective on the need- 
for-approval groups (interaction F = 6.77, p 
< .005). Considering first the positive rein- 
forcement data, to isolate the differences, a 
t test was run between the means of the posi-
tively reinforced high and low need-for-ap-
proval groups for the first time period, yielding 
a nonsignificant value of 1.21. It appears that 
at this early stage, verbal conditioning had not 
begun to differentiate the high need-for-
approval group from the other groups. Thus, 
the differences between the reinforced high 
need-for-approval group and the low or control 
groups are to be found later in the task with a 
consistent superiority of the reinforced high 
group during the remainder of the conditioning 
periods. The smallest of these differences, that 
between the positively reinforced high and 
low need-for-approval groups for the fourth 
5-minute period, was tested by means of t. The 
obtained value was 2.20, which is significant 
beyond the .02 level. (One-tailed tests were 
employed in this and subsequent analyses 
because directional differences were specified 
in advance of the study.) It is interesting to 
note that the cubic between-group trends 
terms of the analysis of variance reached 
borderline significance, suggesting some tend-
ency for the curve of the positively reinforced 
high need-for-approval group to depart in 
slope and form from the other curves. 

Fig. 2. Proportions of plural nouns given by high 
and low need approval groups under negative rein-
forcement and nonreinforced control conditions. (Axis label: 
5-minute periods.) 

In the negative reinforcement comparison, 
t tests were also run between certain of the 
means of the reinforced and nonreinforced 
groups. Comparing the high and low need-
for-approval groups under negative reinforce-
ment, significant differences were obtained 
for the first, third, and fourth time periods. 
However, “uh-uh” was effective in differen-
tiating the punished high need-for-approval 
group from its nonreinforced counterpart 
only during the fourth time period. The effect 
of the experimenter’s verbal punishment was, 
thus, not as consistent in producing between-
group differences in plural noun usage as the 
effect of approval. 

TABLE 1 

ANALYSIS OF VARIANCE OF HIGH AND LOW NEED 
APPROVAL GROUPS UNDER THE THREE 
REINFORCEMENT CONDITIONS 

Source                                  df       MS        F 
A. Overall Trend                        (4)    .031839    3.83** 
     Linear                               1    .075927    6.09* 
     Quadratic                            1    .006956    0.86 
     Cubic                                1    .043719    6.87** 
     Quartic                              1    .000754    0.12 
B. Between-group means                  (5)    .261322    4.82**** 
     a. Need Approval                     1    .003586    0.07 
     b. Reinforcement condition           2    .284475    5.25** 
     c. Interaction                       2    .367035    6.77*** 
C. Between-group trends                (20)    .008762    1.05 
   1. Linear                            (5)    .008948    0.72 
     a. Need Approval                     1    .014859    1.19 
     b. Reinforcement condition           2    .010737    0.86 
     c. Interaction                       2    .004203    0.34 
   2. Quadratic                         (5)    .009420    1.16 
     a. Need Approval                     1    .005375    0.66 
     b. Reinforcement condition           2    .017112    2.11 
     c. Interaction                       2    .003750    0.46 
   3. Cubic                             (5)    .012002    1.89 
     a. Need Approval                     1    .003226    0.51 
     b. Reinforcement condition           2    .013586    2.14 
     c. Interaction                       2    .014805    2.33 
   4. Quartic                           (5)    .004677    0.74 
     a. Need Approval                     1    .001788    0.28 
     b. Reinforcement condition           2    .006786    1.07 
     c. Interaction                       2    .004011    0.63 
D. Between-individual means             120    .054212 
E. Between-individual trends          (480)    .008310 
   1. Linear                            120    .012465 
   2. Quadratic                         120    .008092 
   3. Cubic                             120    .006363 
   4. Quartic                           120    .006319 
F. Total                                629 

   * p < .02. 
  ** p < .01. 
 *** p < .005. 
**** p < .001. 
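
The F ratios reported in Table 1 can be checked arithmetically from the tabled mean squares. The sketch below assumes that each between-group means effect is tested against the between-individual means term and that each trend component is tested against the matching between-individual trend term. That pairing is inferred from the fact that it reproduces the printed values; the text does not state it. The function name is illustrative.

```python
def f_ratio(ms_effect: float, ms_error: float) -> float:
    """F ratio: effect mean square divided by its error mean square."""
    return ms_effect / ms_error


# Error terms taken from Table 1 (rows D and E).
MS_BETWEEN_INDIVIDUAL_MEANS = 0.054212
MS_BETWEEN_INDIVIDUAL_TRENDS = 0.008310
MS_INDIVIDUAL_LINEAR = 0.012465
MS_INDIVIDUAL_CUBIC = 0.006363

# Effect mean squares taken from Table 1 (rows A and B).
f_reinforcement = f_ratio(0.284475, MS_BETWEEN_INDIVIDUAL_MEANS)
f_interaction = f_ratio(0.367035, MS_BETWEEN_INDIVIDUAL_MEANS)
f_overall_trend = f_ratio(0.031839, MS_BETWEEN_INDIVIDUAL_TRENDS)
f_linear = f_ratio(0.075927, MS_INDIVIDUAL_LINEAR)
f_cubic = f_ratio(0.043719, MS_INDIVIDUAL_CUBIC)

for name, value in [
    ("reinforcement condition", f_reinforcement),
    ("interaction", f_interaction),
    ("overall trend", f_overall_trend),
    ("linear", f_linear),
    ("cubic", f_cubic),
]:
    print(f"{name}: F = {value:.2f}")
```

Rounded to two decimals, these ratios agree with the F column of the table for the rows shown.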

A finding of potential interest is the consist-
ent though nonsignificant elevation of the 
curve of the low need-for-approval group over 
that of the high group under the control 
condition. It is unlikely that subjects with a 
low need for approval are characterized by 
more pluralistic speech than approval moti-
vated subjects, and this difference may be 
attributable to sampling error, minimal dif-
ferences in task orientation of the two groups, 
or some other presently unspecifiable variable. 

TABLE 2 

MEANS AND STANDARD DEVIATIONS OF THE PLURAL 
NOUN RATIOS OF HIGH AND LOW NEED APPROVAL 
GROUPS UNDER THE THREE REINFORCEMENT 
CONDITIONS 

                                      5-minute periods 
Group                            1      2      3      4      5 

Positive reinforcement 
  High need approval     M     .197   .300   .272   .229   .256 
                         SD    .116   .202   .223   .143   .199 
  Low need approval      M     .157   .179   .149   .143   .431 
                         SD    .149   .128   .135   .109   .098 

Negative reinforcement 
  High need approval     M     .117   .106   .093   .082   .097 
                         SD    .063   .077   .098   .069   .134 
  Low need approval      M     .194   .155   .176   .128   .143 
                         SD    .146   .153   .146   .085   .125 

Nonreinforced control 
  High need approval     M     .187   .155   .144   .134   .117 
                         SD    .099   .115   .129   .101   .007 
  Low need approval      M     .180   .194   .166   .138   .148 
                         SD    .120   .138   .114   .111   .181 

Table 2 shows the means and standard 
deviations of the high and low need-for-ap- 
proval groups under the three experimental 
conditions. 

Among the results of the study is the com- 
plex (cubic) nature of certain of the curves, 
notably that of the high need-for-approval 
group under positive reinforcement. This 
curve is characterized by increase, decrease, 
and a terminal increase. While this observa- 
tion does not bear on the hypothesis, it poses 
the interesting problem of why the decrement 
in the proportion of plural nouns uttered 
occurred in the periods following the second 
5 minutes. Conceivably, under the conditions 
of the plural nouns paradigm, there is an 
asymptotic limit to the number of plurals 
a subject can give relative to his total pro- 
ductivity. Hence, this may be in the nature of 
a partial, temporary “exhaustion” phenom- 
enon, from which recovery appears to occur 
in the final period. It may also be noted that 
the nonreinforced low need-for-approval group 
tended to increase slightly in the production 
of plural nouns from the first to the second 
period. Similarly, Daily (1953) found that his 
nonreinforced group showed an increment in 
the response class investigated in his study. 

It remains to examine an alternative ex-
planation of the results. It is not altogether 
impossible that the obtained differences in 
verbal conditioning are a function of intelli-
gence. Such a finding would occur if there 
were a relationship between intelligence and 
the M-C scale. Scores on the Ohio State 
Psychological Examination (OSPE), a scho- 
lastic aptitude test highly related to other 
measures of intelligence, were available on 
34 of the positive reinforcement subjects and 
31 of the subjects in the negative reinforcement 
condition. The mean OSPE scores of the high 
and low need-for-approval groups were sepa- 
rately compared for the two reinforcement 
conditions. For positive and negative rein-
forcement analyses, the obtained t ratios were, 
respectively, 0.56 and 0.23, providing no 
evidence of differences in intelligence between 
the two groups. Finally, to determine if any 
relation exists between intelligence and verbal 
conditioning, correlations were run between 
OSPE scores and a total conditioning score 
obtained by summing over the time periods. 
For the positive reinforcement condition, r 
= —.06, and the corresponding r for the 
negative reinforcement condition is —.04. It 
is evident that intelligence, as measured by 
the OSPE, is not predictive of verbal con- 
ditioning. 


DISCUSSION 

The results of this experiment establish 
that subjects whose need for social approval 
is high, as compared with subjects less con- 
cerned with approval and nonreinforced con- 
trol subjects, tend to increase the relative 
frequency of the reinforced response class of 
plural nouns under positive reinforcement and 
tend to inhibit plurals when they are followed 
by punishment. In fact, the low need-for- 
approval groups failed to demonstrate con- 
sistent changes in rate of response either under 
positive or negative reinforcement when com- 
pared with the control subjects employed to 
establish the base rate of plural nouns. 

The findings suggest that, in the context 
of this experimental procedure, a reinforcer 
connoting approval is more effective in increas- 
ing the rate of response than is punishment in 
inhibiting response rate for subjects to whom 
approval satisfactions are important. This 
tendency was in evidence despite the fact that 
disapproval was in general far more salient 
to the subjects and for many was highly 
anxiety arousing, as shown by their questions 
and comments, increased response latency, and 
other behavioral indications of disturbance. 
There were, however, no differences between 
high and low need-for-approval groups in 
these manifestations of anxiety as assessed by 
observational ratings of their behavior during 
the conditioning period. 

For all groups, there was considerable vari- 
ability in conditioning. While the need for 
social approval clearly seems to be one deter- 
minant of change in behavior in this experi- 
mental situation, it is undoubtedly not the only 
variable involved. Other studies of individual 
differences in “conditionability” or “respon-
sivity” have predicted change in response 
rate from such diverse personality measures 
as manifest anxiety (Taffel, 1955), compliance 
in psychotherapy as well as test anxiety and 
fearfulness in new situations (Sarason, 1958), 
achievement via independence (Krasner, Ull- 
mann, Weiss, & Collins, 1960), and hypnotiza- 
bility (Weiss, Ullmann, & Krasner, 1960). 
It is important to determine if these and other 
personality variables operate in a meaningful 
constellation and whether prediction can be 
increased by employing multiple measures. It 
remains to be determined what relations obtain 
among these personality variables, and efforts 
to achieve theoretical articulation are needed. 

As to the problem posed by this study, the 
conceptualization of need for social approval 
is considerably enhanced by the findings which 
provide support for the theoretical bridge 
between an individual’s attribution of socially 
approved characteristics to himself on a per- 
sonality inventory and the inference of strength 
of approval motivation from the degree of 
this endorsement. The need for approval was 
demonstrated to conform to a major attribute 
of needs: the ability to effect modifications of 
behavior and, in particular, to influence the 
effectiveness of a reinforcer. 

It appears from the present research and 
other investigations (Allison & Hunt, 1959; 
Couch & Keniston, 1960) that response sets on 
personality measures involve more than a 
simple tendency to agree, dissimulate, or 
respond in a socially desirable fashion in re- 
stricted self-evaluative situations. There is 
increasing evidence to indicate that a subject’s 
needs in a testing situation and his categoriza- 
tion of that situation affect his test responses 
(Rotter, 1960), and these needs may show 
considerable generality beyond this context. 
The present findings and previous research 
on the need for approval are suggestive of the 
extent of this generality, as is Couch and 
Keniston’s research on agreeing response set. 
Here, as with studies of verbal conditioning, 
research to determine the relations among the 
various response-set measures is required. Of 
particular interest in this regard is Couch and 
Keniston’s characterization of the naysayer, 
who demonstrates a “denial” response set, as 
conventional and conforming in his attitudes. 
These characteristics are also descriptive of the 
individual with a high need for social approval. 

It is important to note that no tautology is 
involved in this study in the finding that ap- 
proval motivated subjects increase their rate 
of response under conditions of approval ori- 
ented reinforcement. Subjects were carefully 
screened for awareness of the correct response, 
and only those subjects who were unable to 
recognize the response-reinforcement sequence 
were retained for analysis.² Thus, the effect 
of the need for approval was mediated at a 
level of awareness below that which the sub-
ject was capable of verbalizing. In the main, 
subjects, both those high and low in need for 
approval, tended to characterize the experi-
menter’s “mm-hmm” or “uh-uh” as gener-
alized encouragers or discouragers, or saw these 
reinforcers as pertaining to specific content 
categories which they were to pursue or to 
abandon. There were no differences between 
high and low groups in this respect.³ 

² The following comments of subjects are quoted to 
illustrate their lack of awareness of the correct rein- 
forcement contingency. 

Positive reinforcement, to Question 5: “I would 
say it encouraged me more than discouraged me. I 
knew you were approving—I completed that word, 
it was successfully done.” 

Positive reinforcement, to Questions 4 and 5: 
“Yes, to certain of my reactions you would say ‘mm- 
hmm,’ but I don’t know why—not unless you en- 
joyed the word.” 

Negative reinforcement, to Question 5: “I don’t 
know. That really shakes you up. You seemed to 
say it whenever what I said didn’t have any associa- 
tion with something I'd said before.” 

³ Eriksen (1960) has recently reviewed an experi-
ment by Dulaney (1961) in which a relationship was 
found between the hypotheses held by subjects in a 
verbal conditioning task and their rate of plural noun 
responses. Verbal conditioning occurred only for those 
subjects who interpreted the experimenter’s “mm-hmm” 
as an indication that they were correct in pursuing 
certain content categories. Further, it was shown 
that while there is a tendency for plurals to evoke 
other plurals, this is not true of singular associations. 
Dulaney concludes that the relationship between the 
chaining of plural nouns and the subject’s belief that 
reinforcement was pertinent to content categories is 
sufficient to account for the (spurious) increase in rate 
of plural nouns in that increase in plural noun output 
is only incidental to increase in associations along rein-
forced content lines. The present data were analyzed to 
test Dulaney’s hypothesis. Subjects were divided into 
two groups on the basis of their responses to the aware-
ness questioning: a group containing 18 subjects who 
verbalized a relation between “mm-hmm” and correct-
ness of associations along content lines, and a group of 
26 subjects who gave any other account of “mm-hmm,” 
including lack of recognition. These groups were further 
broken down into high and low need for approval sub-
groups and their plural noun ratios compared for the 
second time period in which the maximum rate oc-
curred. No difference was obtained between high and 
low need for approval groups in the explanation of 
“mm-hmm,” and t’s computed between the mean 
plural noun ratios of the explanation groups were non-
significant. A similar analysis conducted for the nega-
tive reinforcement data yielded nonsignificant results. 
Thus, Dulaney’s contention that the verbal condition-
ing of plural nouns is artifactual does not appear to ap-
ply to the present experiment. 

Generalized to the area of psychotherapy, 
the results of this study are suggestive of the 
kinds of problems that may be encountered 
by therapists attempting to deal with the 
ubiquitous problems of resistance and trans-
ference. The resistance of some patients to the 
therapist’s efforts may be due in part to their 
low need for the approval he dispenses and 
perhaps stronger needs to defend against 
anxiety arousing thoughts and to preserve 
partially satisfying, if neurotic, modes of 
behavior. At the opposite extreme may be 
those highly approval motivated patients for 
whom the therapist has high reinforcement 
value indeed. Perhaps in psychotherapy these 
are the patients rated as good by their thera-
pists: patients who develop a strong positive 
transference and are sensitive to very subtle 
nuances in the therapist’s behavior. 


SUMMARY 


This experiment was undertaken to compare 
the changes in response rate of subjects 
differing in the strength of need for social 
approval on a plural nouns verbal condition-
ing task. The hypotheses were that subjects 
with a high need for social approval, in con-
trast to subjects with a weaker approval need, 
tend to show an increase in the proportion of 
plural nouns under positive reinforcement and 
a decrease in the proportion of plural nouns 
under negative reinforcement. The reinforced 
groups were also compared with a nonrein-
forced control group employed to establish 
the base rate for plural nouns. 

Results supported the hypotheses, with the 
high need-for-approval groups increasing the 
positively reinforced response and inhibiting 
the response followed by negative reinforce- 
ment. No subjects retained in the analysis of 
the data were able to verbalize the reinforce- 
ment contingency. The significant differences 
obtained were found not to be attributable to 
intelligence, nor was intelligence related to 
verbal conditioning. 

The findings were interpreted as providing 
support for the inference of need for social 
approval from a personality inventory meas- 
uring the degree of personal endorsement of 
socially approved characteristics. 


INTERACTION AMONG RETARDED CHILDREN AS A FUNCTION OF 
THEIR RELATIVE LANGUAGE SKILLS¹ 


THERE has been very little direct analysis 
of the interpersonal processes associated 
with behavioral retardation. A few 
authors discuss the possibility that inter-
personal processes play an important role in 
psychopathology (e.g., Cameron & 
Magaret, 1951, Ch. 6; Levinson, 1958; 
McCandless, 1952; Masland, Sarason, & 
Gladwin, 1958, Ch. 14). It is conjectured that 
the social environment is an important deter-
minant in the inception of a process of be-
havioral retardation and that interpersonal 
relationships between a developmentally 
retarded child and his social environment 
affect the degree of behavior pathology. How-
ever, as some writers have noted, these inter-
pretations are based primarily upon studies 
of such variables as gross institutional up-
bringing, socioeconomic status of the family, 
regional differences, maternal care, and on 
records of a few accidental cases of extreme 
social isolation. Psychological descriptions of 
the probable nature and relevance of inter-
personal relationships to behavior development 
are, at present, inferred largely from these 
sociological variables and from clinical 
protocols. 

There are very few specific facts concerning 
the details of natural or contrived interpersonal 
events involving the behaviorally retarded. 
There is the general suggestion that inter-
personal behavior of a retarded person is not 
determined exclusively by his own various 
skill levels, but also by the skills of the other 
interactors. Somewhat more specific is the 
notion that the discrepancy of skill between 
participants is important in determining 
interpersonal action. The present study was 
designed explicitly to ascertain the importance 
of discrepant levels of linguistic skill between 
participants, as well as absolute levels of 
skill, in determining the amount and type of 
interpersonal behavior exhibited. 

of research in 
sponsored by NIMH Grant 

Pairs of children, both of homogeneous and 
of heterogeneous levels of linguistic skill, 
were assembled for observation. Detailed 
quantitative records were made of inter-
personal behavior over a series of free play 
situations. All children had been institu-
tionalized as a result of demonstrated be-
havioral retardation and associated psycho-
pathology. The composition of each dyad was 
systematically varied according to the 
linguistic skills demonstrated by each child in 
preliminary individual tests. Although all 
subjects were institutional residents, a wide 
range of talent on the tests was readily found. 

The use of psychometric tests to measure 
skills has been successful both in the area of 
retardation and in social psychological re-
search. The important relationships between 
the psychometric score (intelligence or other) 
and the interpersonal behavior of an individual 
may not always be linear, however. A number 
of empirical studies in social psychology, 
dealing with social interaction among normal 
adults (Hoffman, 1959; Rosenberg, 1957; 
Schutz, 1958), suggests that individual test 
measures obtained before interaction relate, 
but not in simple ways, either to subsequent 
social performance by the individual or 
to subsequent group products. It is also quite 
apparent from the literature that the relation-
ships between psychometric scores and inter-
action measures have not been highly 
systematized. Although this body of research 
contains certain commonalities, generalizations 
are at present limited almost exclusively to 
specific measures, populations, and inter-
personal settings (see, for example, cautions 
by Shaw, 1960). In effect, the present study 
may be viewed as an initial attempt to deter-
mine the presence and type of an assembly 
1953; Rosenberg, Erlick, & 


METHOD 

Psychometric Tests of Linguistic Skill 

Children were tested one at a time with a standard 
set of language items.³ The test items were divided into 
two categories, an Intraverbal subtest and a Naming 
subtest. The Intraverbal subtest contained 26 items, 
an example of which is, “The flag is red, white, and 
——.” The child is asked to complete the sentence. 
The Naming subtest contained 28 items and consisted 
of various objects, ranging in difficulty, which the 
child is asked to name. A child receives one point for 
each item named correctly. A child who failed to attain 
a score of at least one on both subtests combined was 
excluded from the study. For the remaining children, 
only the scores from the Intraverbal subtest were con-
sidered in classifying them in level of linguistic skill. 
Two levels were specified: “High” (H) children were 
those who obtained a score >20 on the Intraverbal 
subtest. “Low” (L) children were those who obtained a 
score <9 on the Intraverbal subtest. Children with in-
termediate scores were eliminated. The verbal IQ 
scores on the WISC for the H subjects ranged from 45 
to 82. None of the L subjects could be tested with the 
verbal part of the WISC. The correlation between the 
Intraverbal subtest and the WISC verbal is .70; the 
correlation between Naming and the WISC verbal is 
.05. The latter, low correlation is readily explained in 
terms of the low overlap in relative difficulty; most 
items on the WISC are empirically more difficult than 
those in the Naming subtest. These correlations are 
based on an N of 34 for whom WISC scores were avail-
able from the case files. These 34 children are part of 
the sample of 64 to be described below. 
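
The classification rule described above can be sketched as a small function. The function and label names are illustrative, and the strict inequalities (>20, <9) are taken as printed; whether the original cutoffs were inclusive cannot be verified from this text.

```python
def classify_linguistic_level(intraverbal: int, naming: int) -> str:
    """Classify a child from subtest scores (Intraverbal: 26 items, Naming: 28 items)."""
    if intraverbal + naming < 1:
        # No point earned on the two subtests combined: excluded from the study.
        return "excluded"
    if intraverbal > 20:
        return "H"
    if intraverbal < 9:
        return "L"
    # Intermediate Intraverbal scores were eliminated.
    return "intermediate"


# Hypothetical score pairs (Intraverbal, Naming), not data from the study.
examples = [(24, 27), (5, 12), (14, 20), (0, 0)]
labels = [classify_linguistic_level(iv, nm) for iv, nm in examples]
```

Only the Intraverbal score decides between H and L; the Naming score enters solely through the combined-score exclusion.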


Subject Selection 

All subjects were selected from the institutional 
population of Parsons State Hospital and Training 
Center, Parsons, Kansas. This institution is for “emo-
tionally disturbed, mentally retarded children without 
multiple handicap” between the ages of 6 and 21. The 
total number of children in the institution has varied in 
recent years between 619 and 672. There is considerable 
variation within the population in characteristics such 
as self-care, length of institutional residence, psycho-
pathological behavior, obvious biological involvements, 
and intellectual range. The institutional staff carries on 
continuous programs in education, psychotherapy, vo-
cational habilitation, and a number of the adjunctive 
therapies. 

At the time of the study, the institution contained 
139 children between the ages of 12 years 11 months and 
1S years 0 months. From this group, 64 children (30 
boys and 34 girls) were randomly selected for testing 
and classified according to the procedures described 
above. All tests were administered within a 6-week 
period. Eight males and 13 females were classified as H; 
6 males and 11 females were classified as L. Fourteen 
children, 8 girls and 6 boys, were randomly selected as 
potential subjects from each of the two categories, making 
a total of 28 children initially selected. The final re-
sults are based upon 20 of these children because an 
additional criterion, to be described subsequently, was 
imposed during the experiment proper. 

³ The test format and items used in the present study 
have been deposited with the American Documentation 
Order Document No. 6867 ADI 


Experimental Design 

Seven subgroups of four children each were formed. 
Each subgroup contained two H and two L subjects of 
the same sex. Aside from these two restrictions, assign-
ment to the subgroups was random. Three male and 
four female subgroups were thus obtained. All experi-
mental assemblies of dyadic play groups took place 
within a subgroup. Thus each subgroup constituted 
an independent replication of the same experimental 
design. 

The two H and two L subjects in each subgroup 
were randomly assigned to a row or a column of the 
following matrix with the restriction as shown. A, A’, 
B, and B’ represent dyads formed from one row sub-
ject and one column subject. 


                       COLUMN SUBJECTS 
                          H       L 
ROW SUBJECTS      H       A       B’ 
                  L       B       A’ 


The four subjects in a subgroup were brought to the 
research area of the hospital at the same time. Two 
dyads, each containing one row subject and one column 
subject, were then formed and placed in two separate 
playrooms for 15 minutes. For example, dyads A and 
A’ were formed and each put into separate rooms. Two 
new dyads were then formed from the same four sub-
jects after this 15-minute period. Again each dyad was 
composed of one row subject and one column subject. If 
A and A’ had been formed first, B and B’ would be 
formed for the second period of play. The four cells of 
the matrix above were thus completed. Two more play 
periods for the same four subjects followed immediately. 
The order of assembly was not necessarily the same, 
however. Two completions of the matrix constituted a 
“session.” Row subjects were never assembled with 
each other. Similarly, column subjects never formed an 
experimental dyad. 

Figure 1 is a sketch of the floor plan of the playrooms 
in which two dyads of a subgroup were simultaneously 
assembled. An observation room with a one-way mirror 
to each playroom is also shown in the sketch. Playroom I 
was wired so that the subjects could be readily heard 
in the observation room. All behavior recordings were 
done with the dyad in Playroom I. Thus each dyad 
spent one play period in each playroom during a session. 
The assignment of dyads to the playrooms also satis-
fied the following restriction: If the subject was not in 
a dyad in Playroom I during one of the first two 15-
minute periods during one session, he was required to 
be in a dyad in Playroom I during one of the first two 
15-minute periods of the next consecutive session. Each 
playroom contained one table, two chairs, one beach 
ball, one teddy bear, and one package of colored clay. 
The windows of the playroom were covered with drapes. 

Fig. 1. Sketch to scale of the two playrooms and ob-
servation room. (D is a door to hall; W is an outside 
window; M is a one-way mirror; O1, O2 are locations of 
observers.) 
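
The dyad assembly described in this section can be sketched in code. The subject identifiers and function names are hypothetical; the sketch covers only one completion of the matrix and ignores the playroom-assignment restriction.

```python
from itertools import chain


def build_dyads(row_h, row_l, col_h, col_l):
    """Map each cell of the 2 x 2 matrix to its (row subject, column subject) pair."""
    return {
        "A": (row_h, col_h),
        "B'": (row_h, col_l),
        "B": (row_l, col_h),
        "A'": (row_l, col_l),
    }


# Dyads that share no subject and can therefore play at the same time.
SIMULTANEOUS_PAIRS = (("A", "A'"), ("B", "B'"))


def matrix_completion(dyads, start_with_a=True):
    """Return two play periods, each holding two simultaneous dyads."""
    order = SIMULTANEOUS_PAIRS if start_with_a else SIMULTANEOUS_PAIRS[::-1]
    return [[dyads[label] for label in pair] for pair in order]


def subjects_in(period):
    """Sorted list of every subject taking part in one play period."""
    return sorted(chain.from_iterable(period))


# Hypothetical identifiers: row H, row L, column H, column L.
subjects = ("rowH", "rowL", "colH", "colL")
dyads = build_dyads(*subjects)
periods = matrix_completion(dyads)
```

Each period occupies all four children exactly once, and no dyad ever pairs two row subjects or two column subjects, which matches the restrictions stated in the text.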

Thirteen sessions were scheduled for each subgroup. 
The first 8 sessions were scheduled 2-4 days apart. If 
one or more children in a subgroup were absent for a 
session, the session was postponed until the next sched-
uled period. The first session started 3 weeks after the 
psychometric test, and all subgroups were started in the 
same week. After the first 8 sessions, the sessions were 
scheduled once each week. The 13 sessions which were 
to have required 9 weeks actually required up to 27 
weeks in some subgroups because of illnesses and Christ- 
mas vacations. Two of the seven subgroups were 
dropped after the third session because one child within 
each of these two subgroups refused to remain in the 
playroom for the scheduled time. Some preliminary but 
imperfect screening on this characteristic had probably 
already occurred during psychometric testing. Fre- 
quently, a child made a zero score because he did not 
remain in the test room even after several attempts to 
attract him to the tests. As noted previously, such ex-
tremely “low” children were excluded before subgroups 
were formed. 


Behavior Measures 


Two observers (Q;, Os) in the observation room 
categorized the behavior of the subjects in the play- 
room. An O never observed the same subject two 
times in succession. Within a play period, one O ob 
served a row subject and the other O observed a column 
subject. The Os were required to judge the occurrence 
of three individual behavior categories. Three electrical 
buttons, one for each category, were available to each 
O, who was instructed to depress the appropriate key 
when he observed his subject behave in the defined 
manner and to release the button as soon as the behavior
ceased. Records of O’s judgments were made on a
20-pen Esterline-Angus operations recorder. The behavior
categories were described to the Os as follows:

1. Vocalization: All verbalization excluding shrieking,
screaming, crying, laughing, and other reflexive
noises.

2. Gesture: Commonly learned gestures clearly observed
by the other child. Include only:

Beckoning with finger or hand

Shrugging

Pointing

Nodding yes or no

Waving

3. Physical Contact: Includes actual grasping of
part of other child with hand, touching, hitting,
pushing, chasing, offering of toys toward other
child, throwing ball or object toward other child.
Mutual hugging of an object. This excludes rocking.

The last 5 of the 13 sessions were systematically ob- 
served by the two Os. Only data from the last 4 sessions 
were analyzed. The first 9 sessions were used to allow 
the relationships in the dyads to stabilize. Effects of
unequal prior contact among the subjects due to
institutional routine were partially neutralized this way.

Three people were selected as Os for the study, but 
only 2 were used at a time. They were 2 junior college 
students and 1 housewife, selected from a preliminary 
group of 6 housewives and 12 students. Final selection 
of the 3 Os was made on the basis of their reported 
availability and interest after a preliminary contact 
with the task (i.e., self-selection as Os) and the accuracy 
with which they used the categories during observations 
of a film.4 Thirteen persons were eliminated on the first
criterion. 

On the second criterion, after preliminary instruc- 
tions concerning the behavior categories, the complete 
film was shown to the potential Os. During this period, 
they did not rate the children. This showing was then 
followed by a number of repetitions of the first 5 min- 
utes of the film, which contained more communicative 
interaction among the children than other parts. The 
potential Os were asked to observe one of three different 
children during each of three 5-minute repetitions. In a 
fourth run, the experimenter demonstrated his judgments.
Questions by Os were then freely answered.
Three repetitions then followed with the potential Os 
assigned to a different child each time. The selection of 
Os was based on these three final repetitions. The 3 Os 
whose ratings of the children most closely approximated 
the mean ratings of the original 18 Os were selected to 
rate the sessions. Two of these Os received additional
practice in rating the dyads of each subgroup for the 
sixth and seventh sessions. These 2 Os rated all the 


4 A 20-minute film titled One, Two, Three, Go, produced
by Metro-Goldwyn-Mayer, 1946. The film is
readily available through Teaching Films Custodian,
Inc., 25 West 43rd Street, New York, N. Y. One, Two,
Three, Go is an “our gang” type of safety film. One of
the gang is injured by being hit by an automobile
while chasing a ball in a sand-lot baseball game. There
is considerable child-child interaction during portions of
the film where the rest of the gang is making the trip to
visit him in the hospital.

sessions with the exception of one session for each of two
matrices. For these two sessions, the third O served as 
an alternate. 
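
The selection rule on the second criterion (keep the 3 candidates whose ratings lie closest to the group mean) can be sketched as follows. This is only an illustration under stated assumptions: the counts are invented, the function name `select_observers` is hypothetical, and mean absolute deviation is an assumed closeness measure, since the text does not say how closeness to the mean ratings was computed.

```python
from statistics import mean


def select_observers(ratings, k=3):
    """Return the k observer ids whose ratings are closest to the group mean.

    ratings maps an observer id to a list of frequency counts, one per
    rated child, in the same order for every observer.
    Closeness is the mean absolute deviation from the per-child group mean
    (an assumed measure; the source does not specify one).
    """
    n_items = len(next(iter(ratings.values())))
    group_mean = [mean(r[i] for r in ratings.values()) for i in range(n_items)]

    def deviation(obs):
        return mean(abs(x - m) for x, m in zip(ratings[obs], group_mean))

    # Smallest deviation first; keep the k closest observers.
    return sorted(ratings, key=deviation)[:k]


# Hypothetical counts for five candidates who each rated three children.
example = {
    "A": [10, 12, 8],
    "B": [11, 13, 9],
    "C": [30, 2, 20],
    "D": [9, 11, 7],
    "E": [0, 25, 1],
}
chosen = select_observers(example)
```

With these invented counts, candidates C and E deviate most from the group mean and are the ones dropped.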

RESULTS 

Three measures were obtained for each 
subject on each of the last four trials: total 
frequency of vocal responses during a 15- 
minute play period, total frequency of gestural 
responses during the play session, and square 
root transformation ($\sqrt{x + .5}$) of the total
frequency (x) of physical contacts during the 
play session. The transformation of frequency 
of physical contacts was necessary because of 
extreme skewness in the original frequency 
counts (Edwards, 1950, pp. 199-202). 
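
The square root transformation of the contact counts can be illustrated with a short sketch. The counts below are invented for illustration and the function name `sqrt_transform` is hypothetical; only the formula, the square root of x + .5, comes from the text.

```python
import math


def sqrt_transform(count):
    """Square-root transformation sqrt(x + .5) of a frequency count x."""
    if count < 0:
        raise ValueError("frequency counts cannot be negative")
    return math.sqrt(count + 0.5)


# Hypothetical, strongly skewed contact counts for a set of play periods.
raw_counts = [0, 0, 1, 2, 40]
transformed = [sqrt_transform(x) for x in raw_counts]
```

A count of zero maps to about .7 rather than to 0, and the one extreme count is pulled much closer to the rest, which is the purpose of the transformation for skewed frequency data.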

The individual behavior measures obtained 
for each subject were classified according to 
one of the five subgroups (matrices) and one 
of the four sessions. Furthermore, each cell of 
each 2 X 2 matrix contains two values, one 
designating the behavior of the row subject 
and one of the column subject. 

The response data of row and column 
subjects were analyzed separately. The 
separate analyses of row and column subjects 
do not constitute two independent replications 
of an experiment, however. The presence of 
one member of each dyad in one set of data 
and the other member in the other set may 
introduce systematic similarities between the 
two sets of results. The data of the row sub- 
jects were arbitrarily selected as “primary” 
data. The column subjects’ data were analyzed 
and are presented to permit comparisons of 
results from the two sets of individual be- 
havior data. 

Each response measure of the row subjects 
was subjected to a separate analysis of vari- 
ance. The statistical design was the same for 
all three measures. The nature of the design 
and the results for row subjects are sum- 
marized in Table 1. The row subjects in these 
analyses are termed “speakers” and the column 
subjects are termed “listeners.”

The three variance sources of greatest 
interest are those attributable to speaker’s 
linguistic skill (Source 1), listener’s linguistic 
skill (Source 2), and the statistical interaction 
of listener’s and speaker’s skill levels (Source 
5). Each of these effects was tested (Table 1) 
against two error terms. One type of error 
term permits generalization across trials
(Sources 6, 8, or 11). The second and more
important error term permits generalization 
across individual matrices (Sources 7, 9, 12). 
Both Fs for each variance source are shown in 
Table 1. 
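
As a worked check of the two-error-term procedure, the sketch below recomputes both Fs for the speaker X listener interaction (Source 5) in vocal behavior of row subjects, using the mean squares printed in Table 1. The function and variable names are illustrative only.

```python
def f_ratio(ms_effect, ms_error):
    """F ratio: effect mean square divided by the chosen error mean square."""
    return ms_effect / ms_error


# Mean squares for vocal behavior of row subjects, as printed in Table 1.
ms_interaction = 83980   # Source 5: speaker skill X listener skill
ms_error_trials = 308    # Source 11: 1 X 2 X 3 (generalization across trials)
ms_error_matrices = 932  # Source 12: 1 X 2 X 4 (generalization across matrices)

f_trials = f_ratio(ms_interaction, ms_error_trials)
f_matrices = f_ratio(ms_interaction, ms_error_matrices)
```

Rounded to two decimals, the two ratios agree with the tabled Fs of 272.66 and 90.11 for this effect.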

Table 2 presents the analyses of response 
measures obtained from column subjects. In 
general, the results are comparable with those 
obtained from row subjects. With little 
exception, the two main effects, speaker’s skill 
or listener’s skill, are generally not significant 
in any of the analyses. It is obvious that 
linear relationships between psychometric 
score and individual situational performance 
fail to obtain here. On the other hand, the 
statistical interaction effects in both tables are 
highly significant for vocal responses and 
generally significant for gestural responses. The 
least stable measure, in terms of significant 
interaction effects, is transformed physical 
contact. 

Figure 2 presents the data graphically. The 
three graphs on the left of the figure are data 
from row subjects; the graphs on the right are 
from column subjects. Examination of the 
graphs in Figure 2 reveals the nature of the 
sources of variances tested in Tables 1 and 2. 
The top two graphs of Figure 2 clearly show 
that vocal productivity of a subject is related 
directly to the similarity of his psychometric 
score with that of the listener or recipient with 
whom he is assembled. This is a relatively 
simple but important interaction. It is not 
removable by any transformation of the 
measure (Scheffe, 1959, pp. 95-98). The graphs 
give very little indication of any main effect. 
It is also interesting to note that in dyads 
composed of heterogeneous skill levels (H-L 
and L-H), the absolute frequency of vocal 
responses by either child is almost zero in all 
sessions. 

The gestural responses which are plotted in 
the second two graphs of Figure 2 suggest a 
similar type of statistical interaction. The 
effect is not as striking as that obtained for 
vocal data, but it is apparent that a dyad 
composed of two L subjects yields as much 
gestural behavior, on the average, as does a 
dyad of two H subjects, and that main effects 
are minimal. Again, heterogeneous dyads show 
very little gestural behavior in the absolute 
sense.

TABLE 1
SUMMARY OF STATISTICAL ANALYSES OF THREE RESPONSE CLASSES OF ROW SUBJECTS

                            Vocal behavior        Gestural behavior       Physical contact    Error term
Source                 df   MS       F            MS       F              MS      F           for F
1. “Speakers”           1   289      1.25         0.0[?]   <1             1102    <1          6
   (Row subjects)                    <1                    <1                     5.11        7
2. “Listeners”          1   1141     17.83*       5.0      <1             1436    3.97        8
   (Column subjects)                 <1                    <1                     14.37*      9
3. Trials               3   242                   15.5                    404
4. Matrices             4   1796                  48.2                    85
5. 1 X 2                1   83980    272.66***    871.2    [illegible]    2566    3.75        11
                                     90.11***              20.84*                 51.30*      12
6. 1 X 3                3   232                   32.8                    1379
7. 1 X 4                4   12899                 108.5                   216
8. 2 X 3                3   64                    30.8                    363
9. 2 X 4                4   12292                 114.1                   100
10. 3 X 4              12   1058                  22.7                    561
11. 1 X 2 X 3           3   308                   15.2                    684
12. 1 X 2 X 4           4   932                   41.8                    50
13. 1 X 3 X 4          12   440                   17.9                    231
14. 2 X 3 X 4          12   405                   15.5                    580
15. 1 X 2 X 3 X 4      12   1047                  21.6                    564
16. Total              79   2975                  42.3                    528

* p < .05
** p < .01
*** p < .001
TABLE 2
SUMMARY OF STATISTICAL ANALYSES OF THREE RESPONSE CLASSES OF COLUMN SUBJECTS
(A Nonindependent Replication)

                            Vocal behavior        Gestural behavior       Physical contact    Error term
Source                 df   MS       F            MS       F              MS      F           for F
1. “Speakers”           1   16       <1           9.8      <1             1133    8.58        6
   (Column subjects)                 <1                    <1                     7.36        7
2. “Listeners”          1   328      1.47         68.5     9.01           567     <1          8
   (Row subjects)                    <1                    <1                     2.32        9
3. Trials               3   104                   25.5                    651
4. Matrices             4   4392                  23.3                    521
5. 1 X 2                1   57460    174.12***    336.2    5.29           2236    4.84        11
                                     442.00***             18.37*                 9.98*       12
6. 1 X 3                3   114                   16.9                    132
7. 1 X 4                4   8831                  144.6                   154
8. 2 X 3                3   223                   7.6                     870
9. 2 X 4                4   6351                  80.5                    244
10. 3 X 4              12   526                   27.2                    370
11. 1 X 2 X 3           3   330                   63.6                    462
12. 1 X 2 X 4           4   130                   18.3                    224
13. 1 X 3 X 4          12   586                   20.3                    476
14. 2 X 3 X 4          12   343                   23.2                    196
15. 1 X 2 X 3 X 4      12   197                   16.9                    260
16. Total              79   2009                  36.4                    386

* p < .05
** p < .01
*** p < .001

FIG. 2. Graphic summary of the results showing
each of the three response classes for row and column
subjects. (The symbols, e.g., H-L, refer to response data
of a high “speaker” assembled with a “low” listener;
other symbols are similarly interpreted.)


contacts in the other three compositions are 
near zero ($\sqrt{x + .5}$, when $x = 0$, is .7). Note,
however, that this statistical interaction effect 
is not stable across trials, i.e., is not significant 
when tested with Source 11. 

The similarity of findings between row and 
column subjects’ data is not surprising. The 
findings are not expected to be exactly the 
same with only a few subjects in the analyses. 
Nevertheless, even with small Ns, the two 
sets of data will appear more similar to each 
other than would two independent sets if we 
make the likely assumption that the inter- 
personal process is such that if one person is 
high or low in vocal, gestural, or other meas- 
ured behavior, the other person is equally 
productive. If, on the contrary, there is a 
systematic inequality in the response outputs 
of the two members, and if it occurs in all 
conditions, it will tend to inflate one of the 
error terms used in the present analyses. 


However, the inflated error term does not 
render equivocal any significant findings 
obtained from such analyses. Unequal output, 
if it occurs only in some conditions, will tend 
to produce heterogeneity of variance (although
it is by no means the only “reason” for heterogeneous
variance) or contribute to statistical
interaction of speaker X listener levels.
Significant results from analyses under such 
conditions are not artifactual, either, since 
asymmetric output in some combinations and 
not in others may be viewed as a possible 
assembly effect in a group product “measure,” 
the difference in productivity between the two 
participants. 


DISCUSSION 


The present study is a modified replication 
of a preliminary study (Rosenberg, 1959), in 
which similarly striking results were obtained 
for vocal data, using only two matrices of 
four children each. The present experiment 
was undertaken to determine whether or not 
the preliminary results were spuriously high. 
We speculated that these findings, based on 
the first five sessions only, might have been 
due to routines in the hospital which fostered 
or imposed more contact between children of 
similar abilities than between children of 
different abilities, and that such differential 
contact resulted in more spontaneous social 
interaction among the homogeneous children. 
There is no satisfactory way of actually 
measuring prior contact among institu- 
tionalized children, but it seems unlikely from 
a knowledge of the institution that the almost 
complete absence of action shown by the 
subjects within the high-low pairs mirrored 
with any accuracy the frequency of contact 
imposed by institutional routines. 

One of the important changes in experimental
design, therefore, was the introduction
of the preliminary sessions among the children
in the new sample before systematic behavior
measurement was undertaken. One could
assume, of course, a process whereby children
gradually interact exclusively with those
whom they see more frequently in their
institutional routines. While the preliminary
sessions did not eliminate the possibility that
hospital routine per se imposed such a relatively
immutable social structure, they did
eliminate any effects which might operate
only during initial contacts between previously 
unacquainted children. 

If it had been possible to assure no prior 
contact among all subjects, the observation 
and analysis of all sessions would have been 
extremely interesting. However, interpreta- 
tions of initial temporal effects would be more 
or less equivocal with the subject population 
of this study. If the interaction patterns 
stabilized rather early, as the preliminary study 
seemed to indicate, it would not necessarily 
follow that strangers discriminate each others’ 
skills quickly and interact accordingly. Even 
relatively slow stabilization of interaction 
pattern (say, after 5-10 sessions) may not be 
representative of temporal effects between 
children who are known to be complete 
strangers. Research with children who are 
initially unacquainted would throw light on 
the temporal development of the interaction 
patterns as well as on the possibility of 
immutable institutional structuring as respon- 
sible for the effects obtained here. 

Even in dyads for which absence of prior 
contact is assured, the behavior of the subjects 
may be a generalization based upon greater 
familiarity with relatively homogeneous 
children. That is, if the stranger is similar in 
various psychological or physical dimensions 
to the child’s frequent contacts, high inter- 
action will be expected to occur. For non- 
institutionalized subjects, cultural segregation 
correlated with individual differences in skill 
may exist in the school, the neighborhood, the 
family, etc. In the case of the present institutional
study, rapid generalization to a strange
child may be based, in part, upon the physical 
appearance of the low children. That is, 
physical deformities are more likely to exist in 
an L group than in the H group of children. 

Of the 10 L children in this study, 7 were 
judged as having a physical deformity. None 
of the 10 H children were so judged. Two 
research staff members made independent 
judgments of physical appearance and rated 
4 Ls as mongoloids, 2 Ls as cerebral palsied 
(only one judge), and 1 L as having extreme 
protruding lips and a small, pointed cranium. 
Nevertheless, inspection of the data indicated 
that H children emitted as many or more 
vocalizations and gestures when assembled
with the 7 deformed Ls as with the remaining
3 Ls. Deformed Ls were likewise indistinguish- 
able from the other Ls in the amount of 
behavior they emitted in the presence of Hs. 
Generalization based upon obvious physical 
deformities may, of course, be most pronounced 
in initial sessions. For the present, we have 
satisfied ourselves that the laboratory effect 
was stable and not due to absolute absence of 
contact between heterogeneous subjects. The 
results are important whether interpreted as 
an interpersonal process likely to occur spon- 
taneously or as a process initially imposed by 
external social structuring. 

The results of the present study are com- 
patible with the commonly held notion that 
homogeneity of certain individual psycho- 
logical properties plays an important role in 
interpersonal attraction. However, most 
systematic work relating individual char- 
acteristics to interpersonal choice and inter- 
action has been performed with normal adults. 
Newcomb (1956), for example, has examined 
the relationships between attitudinal agree- 
ment and interpersonal attraction among 
college students and has suggested that these 
relationships are close and important. The 
contrasting hypothesis of complementarity of 
psychological characteristics is exemplified by 
Winch’s (1958) work on mate selection. 
Actually, these two general notions are not 
necessarily in conflict. Unfortunately, the 
research on assembly is neither sufficiently 
large nor consistent to provide a systematic 
account of the ways in which individual 
measures may be combined in order to predict 
interpersonal behavior. In any event, the 
population types, the individual measures, 
and the interactional contexts for most 
theoretical discussions in this area are quite 
different from those of the present study. 
The present study is not, therefore, a “test” 
for any theory, nor are the results anything 
more than compatible with the gross similarity 
hypothesis. Also, extensive psychological 
interpretations concerning the interpersonal 
dynamics which may mediate homogeneity in 
psychometric while possible, are 
actually almost wholly speculative at the 
present time. To the extent that these inter- 
pretations of dynamics are testable, they 
require considerably more empirical research.
The direction of these empirical efforts may be 
suggested not only by theoretical interpreta- 
tions, but also by the experimental results 
themselves. 

The present experiment certainly suggests 
additional and more detailed experimental 
analysis, employing the same general social 
psychological methodology. First, the details 
of the interpersonal behavior are not well 
known. For example, the distinction between 
intelligible—i.e., culturally normative—and 
unintelligible vocal behavior must be made. 
The preliminary study (Rosenberg, 1959) 
attempted to make this distinction by in- 
structing raters to judge the “intelligibility” 
of each vocal response. The problems of 
reliable measurement and removal of obvious 
rating artifacts were insoluble. It appears to 
be necessary, or at least desirable, to devote 
some research efforts to this measurement 
problem per se. The lumping of all vocalization 
into one category does not detract from the 
fact that vocal behavior, as defined for the 
observers in this study, is social in form. 
Whether it is culturally negotiable or not, the 
“low” children maintain a high rate of response 
in the presence of a social object who also 
maintains a high rate of similar behavior. 
They are somehow able to discriminate quite 
accurately between a peer who is equally low
and one who is not.

Second, an expanded range of talent and 
more than two levels of linguistic skill would 
be extremely important in spelling out the na- 
ture of the statistical interaction. 

Third, the previous and present studies both 
limited assemblies to members of the same sex. 
Generalizations are obviously limited to such 
assemblies. The institutional routine does 
decrease the frequency of contacts between the 
sexes. However, it would be surprising if any 
polarization effects would hold in the case of 
heterosexual assemblies, particularly among 
older children. A variety of sexual advances, 
rather than mutual isolation, would certainly 
be expected from older children who are 
necessarily quartered according to sex. Hetero- 
sexual assemblies could change considerably 
the statistical interaction effects discovered 
within the present design. 

Fourth, the free play situation is only one 
of the many types of situations in which interaction
may be observed. Other, more structured
tasks may be developed for analysis of
interaction among these children. 

The present results, evaluated against the 
possible cultural processes of segregation which 
perpetuate themselves, imply a number of 
things about the learning situations which 
may obtain for children. These processes may 
set up learning conditions for children which 
are not necessarily optimal with respect to 
their learning potential. Or these social condi- 
tions may be optimal for some children but 
not for others. It is conceivable that institu- 
tional routines may be geared, in part, to take 
advantage of these social processes. The 
composition of children during play periods, 
eating periods, in wards or cottages, and in 
classrooms, is potentially manipulable. Considerably
more research is necessary, however,
before the implications of the social process 
for learning become useful. 

Finally, the present methodology would 
seem to offer some interesting possibilities for 
other types of psychopathological process. 
Pairs composed of adults with demonstrated 
pathology of various kinds could be observed 
over an extensive period of time. Pairs com- 
posed of a normal and a pathological person 
may be instituted in some settings also. The 
subtle processes of psychotherapy may be 
viewed as a dyadic situation in which assembly 
effects may play a prominent role. 

SUMMARY 

The present study was concerned with the 
analysis of interpersonal behavior of pairs of 
children over a series of free play situations 
as a function of the linguistic skills of the 
individual participants. Children between the 
ages of 12-15, residents of an institution
because of retardation and associated psycho- 
pathology, served as subjects. Subjects were 
selected on the basis of preliminary psycho- 
metric tests of linguistic skills. Two extreme 
groups, “high” and “low” were obtained using 
these test scores. Subgroups of four subjects, 
containing two high and two low subjects were 
then identified. Within each subgroup dyads 
composed of combinations of high and low 
skill levels were formed and scheduled for a 
series of play sessions in a laboratory playroom. 
The design permitted explicit analyses of
differences in social behavior among various
dyads which were attributable to the skill 
levels of each subject per se as well as to the 
combination of skill levels in the dyad. 

Three responses were measured during play 
by observations through a one-way mirror. 
Responses of each subject in the dyad were 
separately recorded and consisted of vocal 
behavior, gestural behavior, and physical
contact with the other subject. Results with 
the vocal and gestural responses indicated 
that the skill level of the subject per se is not 
significant in predicting his productivity. 
However, the combination of skill levels in 
the dyad is highly significant in the prediction 
of interaction behavior. The vocal and gestural 
productivity of a low subject in play with 
another low is as large as that of a high subject 
with another high subject. In heterogeneous 
assemblies, however, productivity drops almost 
to zero. Results with the physical contact 
measure are not as clear-cut. The combination 
of a low with a low yields the largest frequency 
of response while the other combinations ap- 
pear to be equally small. 

The results were discussed in terms of social 
processes which may be associated with 
behavioral retardation and with the implica- 
tions for learning as a result of these processes. 
Additional research, not only with retardation
but with other forms of psychopathology, was
suggested.

ACCEPTANCE OF PUNISHMENT AND CHANGE IN BELIEF1


BERTRAM H. RAVEN AND MARTIN FISHBEIN


University of California, Los Angeles 


FESTINGER (1957), in his theory of cognitive
dissonance, discusses the effects of
“forced compliance” on dissonance and
change in belief (pp. 84-122). A person, asked
to perform an act that he finds disagreeable, 
may, in some instances, face the alternative of 
losing a reward that is offered for performing 
the act. In other instances, the alternative may 
be accepting a punishment that is threatened 
unless the act is performed. Compliance occurs 
if the negative valence of the punishment or the 
positive valence of the reward is greater than 
the negative valence of the act itself. The more 
similar these valences are, the greater the conflict
before decision. Following compliance,
the individual tends to resolve the incompati- 
bility between his behavior and the negative 
valence of the act that he has performed. 
Similarly, following noncompliance, he must 
resolve the dissonance between his behavior 
and his acceptance of punishment or his giving 
up a reward. The dissonance may be reduced 
by cognitive change. For the noncompliers, one 
means of reducing it would be to increase the 
negative valence of the task, while for the com- 
pliers, dissonance would be reduced if the task 
were seen more favorably, i.e., if the negative 
valence were reduced. 

The tendency to bring one’s private beliefs 
into line with one’s verbal statements has re- 
ceived considerable support (Culbertson, 
1957; Janis & King, 1954; King & Janis, 1956; 
Raven, 1959; Sarbin & Jones, 1955; Scott, 
1957; Sims & Patrick, 1936). Further, a great 
deal of support for the theory with respect to 
compliers has also been reported. Festinger 
(1957) has interpreted data from a study by 
Kelman (1953) to support such a hypothesis, 

1 This study was supported by the Group Psychology
Branch, Office of Naval Research, Contract Nonr 
233(54), and is reported more fully in Technical Report 
No. 3 under that contract. A preliminary report of this 
research was presented at the Annual Meeting of the 
American Psychological Association, Chicago, Septem- 
ber 1960. 

We are indebted to Lon R. Davis for his assistance 
in constructing the equipment which was utilized in
this investigation, and to Leonard Vosen and Stephen
Bindman who conducted the experiment.

but with the usual trepidation that accompanies
post hoc interpretation of experimental
data. Cohen, Brehm, and Fleming (1958) re- 
port that after students were asked to make 
statements contrary to their opinions, they 
tended to change their opinions so as to make 
their opinions consonant with their state- 
ments—especially when no adequate justifica- 
tion was given for compliance. Festinger and 
Carlsmith (1959) were able to secure clear-cut 
support for dissonance reduction by compliers 
in an experiment in which subjects, after en- 
gaging in an unusually boring task, were asked 
to tell succeeding “subjects” that the task had 
in fact been quite pleasant. When they were 
paid $1.00 to do this, dissonance after the act 
was expected to be quite great, but with a 
greater reward of $20.00, dissonance would be 
less since the $20.00 would provide justifica- 
tion for saying something that seemed con- 
trary to fact. In a separate questionnaire, 
that presumably was not to be seen by the 
original investigators, subjects were asked to 
indicate how pleasant or enjoyable the task 
was. As predicted, the high dissonance sub- 
jects ($1.00) were especially likely to rate the 
task as more enjoyable, thus bringing their 
private evaluation of the task into line with 
their statements to the presumed “subject.” 
Other support for dissonance reduction in
compliers may be found in Brehm (1957 un-
published, 1959, 1960 unpublished) and Brehm
and Cohen (1959).

Dissonance reduction among noncompliers
has received less attention, though here, again,
Festinger’s interpretation of Kelman’s (1953)
data is supportive. It is to this derivation from
Festinger’s (1957) theory that our experiment
is directed. Specifically, we wished to present
individuals with a situation in which they
would be asked to indicate whether they had
received an ESP image. Some subjects who did
not receive the image would receive punish-
ment in the form of an electric shock. Others
would not receive punishment for failure to see
an ESP image. Those who received punish-
ment should experience dissonance between
the knowledge that they had not indicated the
reception of an image and the knowledge that 
they had thus accepted a quite painful electric 
shock (which could have been avoided had they 
reported reception). They should therefore be 
especially likely to show cognitive change so 
as to reduce dissonance. Specifically, we pre- 
dicted that these subjects would be especially 
likely to reduce their belief in the existence of 
ESP; if they were convinced that ESP did not 
exist and that reception was impossible, then 
a failure to report reception would appear 
justifiable even if this led to a painful shock. 


METHOD 
Measure of Belief 


Belief in extrasensory perception was measured on 
the AB scales (Fishbein & Raven, 1959). Suggested by 
Osgood’s semantic differential (Osgood, Suci, & 
Tannenbaum, 1957), this instrument consists of 20 
polar adjectives, 5 of which (the B scale) specifically 
measure belief in the existence of an object (impossible- 
possible, false-true, nonexistent-existent, improbable- 
probable, unlikely-likely). Since each polar pair had a 
seven-point scale, a rated concept could give a belief 
score ranging from 5 (complete disbelief in existence of 
the concept) to 35 (complete belief in its existence). 
All students in a section of introductory psychology 
at UCLA were asked to rate ESP and two other con- 
cepts in class. These ESP scores were taken as estimates 
of belief at the time of the experimental session and 
were used to attempt to match groups in mean initial 
belief score. The belief measure was then given both at 
the beginning and at the end of the experimental 
session. 
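
The scoring of the B scale can be sketched as follows. This is a minimal illustration that assumes each of the five polar pairs is rated from 1 to 7 and that the ratings are simply summed, which reproduces the stated range of 5 to 35; the names `B_SCALE_PAIRS` and `belief_score` are hypothetical, and the original instrument may have been scored with a different but equivalent procedure.

```python
# The five polar adjective pairs of the B scale, as listed in the text.
B_SCALE_PAIRS = (
    "impossible-possible",
    "false-true",
    "nonexistent-existent",
    "improbable-probable",
    "unlikely-likely",
)


def belief_score(ratings):
    """Sum five seven-point ratings (1-7) into a belief score from 5 to 35."""
    if set(ratings) != set(B_SCALE_PAIRS):
        raise ValueError("expected one rating for each of the five B scale pairs")
    if any(not 1 <= value <= 7 for value in ratings.values()):
        raise ValueError("each rating must lie between 1 and 7")
    return sum(ratings.values())


# Extreme response patterns give the end points of the score range.
lowest = belief_score({pair: 1 for pair in B_SCALE_PAIRS})
highest = belief_score({pair: 7 for pair in B_SCALE_PAIRS})
```

Under this assumed scoring, a subject who marks the disbelief end of every pair scores 5 and one who marks the belief end of every pair scores 35.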


Subjects 


Data reported in this paper are based on 52 under- 
graduate subjects, 13 males and 13 females in each of 
two experimental conditions. It was decided before the 
experiment that only data from subjects who had 
initial belief scores within a range from 10 to 30 would 
be considered in testing our hypotheses, since we 
wished to allow for some possibility of change in belief 
in both the positive and negative directions. Seventeen 
subjects were excluded from further consideration on 
this basis. Since we wished to test only the behavior of 
those subjects who did not indicate reception of ESP 
in the experimental situation, an additional 12 subjects 
who indicated that they had received an ESP image 
were not used in testing our hypotheses. The eliminated 
subjects did not come differentially from one experimental
condition as compared to another.2





2 In addition to these, 11 other subjects were elimi- 
nated from consideration in our data: 5 for failure to 
complete the semantic differential measure, 4 because 
of apparatus failure, 1 for knowing the experimenter, 
and 1 for being suspicious of the experiment. These 
subjects were not disproportionately distributed in the 
experimental conditions. 


Procedure 


Premeasure of belief. When the subject appeared to 
take part in the experiment, he was met by a male 
interviewer. Care was taken that the interviewer did 
not know the experimental condition to which the 
subject would be assigned. Initially, the subject was
asked to rate the concept “extrasensory perception” on
the AB scales. It was explained to him that “extrasensory
perception” involved the “transmission of information
between people through means other than
the usual sense organs” and that it was sometimes
called “mental telepathy.” Following this, the subject
rated the concept “racial prejudice” on the same scales.
The purpose of the latter measure was to decrease the 
possibility of subjects’ remembering their precise 
responses to ESP. The subject was then taken into an 
adjacent room, for the major part of the experiment. 

Experimental situation. In the experimental room, 
the subject was greeted by a second male experimenter 
and seated at a table which was perpendicular to the 
experimenter’s desk, and facing the door. On the door, 
directly in front of the subject and 6 feet away, a 
20” X 30” white cardboard was mounted on a 30” X 
40” black background. On the subject’s table were 
two telegraph keys mounted on a plywood board, with 
the words Yes and No printed under the left and right 
keys, respectively. On the subject’s left were two copper 
electrodes attached to rubber wrist bands. On the 
experimenter’s desk, there was a plywood board on 
which was mounted a telegraph key, a transformer 
with a dial to regulate amount of current, a buzzer, 
a 20-pen Esterline-Angus multiple-event recorder, two 
6-volt dry cells, and a collection of relays, transformers, 
timers, and toggle switches. 

After the subject was seated, the electrodes were 
fastened to his left hand, one across the fleshy part and 
the other on the wrist. It was explained that his shock 
level would be measured later. Subjects were given 
the opportunity to withdraw from the experiment if 
they wished to avoid shock. Only one subject withdrew 
for this reason. It was explained that the experiment 
was to be a test of extrasensory perception. The experimenter
emphasized that although he neither believed
nor disbelieved in ESP, it was necessary to test its
possible existence in as many situations as possible. A
man claiming to be a “sender” would attempt to transmit
the image of the word contemporary to the subject,
and a white card with the word on it in red was shown
to the subject. (The letters had been traced from the
cover of Contemporary Psychology.) Periodically, a
buzzer would sound simultaneously in both rooms,
during which time the “sender” would concentrate on
the word on the card, while the subject would concentrate
on the white card in front of him and attempt to
receive the image.³

When the subject felt that he received an image of
the word, he was to push the telegraph key marked
Yes, as rapidly after the buzzer had stopped as possible.
Similarly, he was to push the key marked No
when he did not feel that he had received an image.

3 We selected the word Contemporary for transmission
because its complexity rendered it particularly
difficult to image. In pretesting, a large number of
subjects reported “receiving” a red circle, a red star,
and several other simple objects even when there was
no social pressure or threat of punishment.

It was stressed to the subject that he should only
push the Yes key “if you are certain that the impression
that you see is actually being sent to you by the
person in the other room. If you get no impression, or
if you feel that the impression you see is merely an
image you are projecting, without the help of the
sender, then don’t push the key. . . .”

The subject then went through a series of practice 
trials, during which the experimenter said “Yes” or 
“No” while the buzzer was sounding. This was to give 
the subject practice in waiting until the buzzer had 
stopped and then responding as soon as possible. 

The next step involved getting the maximum shock 
level which the subject could endure. By now, the 
amount of perspiration on the electrode connections had 
presumably stabilized, and the subject was given in- 
creasing amounts of shock and encouraged to accept 
more until he felt that he could not endure any addi- 
tional increase. 

The major ESP test was now to begin. The instructions
were reviewed and the subject told to respond
“Yes” for “receive” and “No” for “not receive” as
rapidly as possible after the buzzer had stopped. He
was not told exactly how many trials would be given
him.
The “sender,” who was presumably in an adjacent 
room, was given a signal to begin his transmissions. 
Each trial took 16 seconds: the buzzer automatically 
sounded for 6 seconds, followed by 10 seconds of silence 
before the next trial. The subject pushed a Yes or No 
key as soon as possible after the cessation of the buzzer. 
An Esterline-Angus multiple-event recorder recorded 
the buzzer signal and also the response, which gave a 
measure of latency of response. All subjects went 
through a series of 12 trials. 

Punishment variation. There were two major varia- 
tions in the experiment: one with high punishment for 
failure to report ESP images—the Shock condition; 
and one with no punishment—the No Shock condition. 
Since the experimenter was not aware of the premeasure 
of belief and since the interviewer was not to know the 
experimental condition into which the subject would be 
assigned, the belief scores which the subject had given 
in class before appearing at the experiment were used 
to assign them to either the Shock or No Shock condi- 
tion. These assignments were made so as to make the 
premeasures of belief approximately similar. Un- 
fortunately, as we shall see later, the classroom scores 
did not prove to be as good an estimate of the prebelief 
scores as we would have wished. 

The punishment variation was introduced just 
before the trials began. Subjects in the No Shock 
condition were told that there would be no further 
shock and the electrodes were removed. Subjects in the 
Shock condition were told that they would receive a 
shock after every trial for which they pushed the No 
key. They were told: 

One condition under which I am interested in
studying ESP is that of shock. I am not sure if
shock helps, hinders, or changes ESP, but I hope to
find out. Whenever you push the “No” key you
will receive an electric shock, which is at the highest
level that you experienced earlier. When you push
the “Yes” key you will not receive any shock.
However, do not push the “Yes” key just to avoid
the shock, but only when you are certain that you
are receiving an impression from the sender.

In the Shock condition, a 1-second shock, at the 
highest level that the subject had said he could endure, 
was automatically administered immediately after each 
time that the No key was pushed. As mentioned earlier,
the 12 subjects who pushed the Yes key at any time
were not considered further in this experiment since 
we were interested only in subjects who did not feel 
that they had received ESP images and who responded 
No even if they did expect to receive punishment for it. 
Seven of these subjects were in the No Shock condition 
and five in the Shock condition. 

Postmeasure of belief. All subjects then went back 
to the room where they had their initial interview and 
were once again asked to rate “extrasensory perception” 
on the AB scale. They also answered a number of other 
questions intended to check on the effectiveness of the 
experimental variables and to gain additional informa- 
tion to help interpret the results. 

At the completion of the postmeasure of belief, 
subjects were asked additional open-ended questions to 
determine whether they were suspicious of the experi- 
mental situation, and were allowed to make comments 
and criticisms of the experiment. Subjects were then 
informed that there was no ESP “sender” and the true 
purposes of the experiment were explained. Comments 
and explanations were invited, after which subjects 
were pledged to secrecy and allowed to leave. 


RESULTS 
Effectiveness of Punishment Variation 


In order to ascertain whether or not the 
shock was in fact painful and did produce con- 
flict, several measures were used. 

A questionnaire item asked: “How much 
discomfort or pain did you feel just before or 
just after you pressed the No button?” The 
subject could check on a 13-point scale ranging 
from 1 (no discomfort) to 13 (unbearable). The 
males and females in the Shock condition gave 
ratings of 7.5 and 8.1, respectively. The corresponding
scores for the No Shock conditions
were 1.2 and 2.2. These differences between 
conditions were highly significant (p < .001; 
z = 5.75; Mann-Whitney U test), with prac- 
tically no overlap of scores for Shock and No 
Shock conditions. There were no significant 
differences between males and females. 
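
The z values reported with the Mann-Whitney U test come from the normal approximation to the distribution of U. A minimal sketch of that computation follows; the ratings are hypothetical, the function name is an assumption, and no tie correction is applied to the variance, which is a simplification.

```python
import math


def mann_whitney_z(a, b):
    """Return (U, z) for sample a versus sample b.

    Tied values receive average ranks; the variance of U is the
    uncorrected n1*n2*(n1+n2+1)/12.
    """
    pooled = sorted(list(a) + list(b))
    # Average rank for each distinct value (ranks start at 1).
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n1, n2 = len(a), len(b)
    rank_sum_a = sum(rank_of[v] for v in a)
    u = rank_sum_a - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mean_u) / sd_u


# Hypothetical discomfort ratings on the 13-point scale
shock = [7, 8, 9, 6, 10]
no_shock = [1, 2, 1, 3, 2]
u, z = mann_whitney_z(shock, no_shock)
print(u, round(z, 2))
```

When every rating in one group exceeds every rating in the other, as in these made-up data, U takes its maximum value of n1 times n2, which corresponds to the "practically no overlap" described in the text.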

A further indication of the effect of shock 
might come from a question that asked: 
“Given a situation such as the one which you 
have just experienced, what percentage of 
students such as yourself do you think might 
say that they had received an impression when 
in fact they had not, or say they had not re- 
ceived an impression when in fact they had?” 
Subjects could check a scale ranging from 0 to 
100%. Subjects who indicated a high percentage
were assumed to be likely to have
themselves considered the possibility of in- 
dicating a Yes response in order to avoid shock. 
The males and females in the Shock condition 
showed mean percentage estimates of 26.4% 
and 26.8%, respectively. For the No Shock 
condition, the corresponding percentages were 
12.7% for males and 15.4% for females. Again 
the differences between conditions were significant
(p < .02; z = 2.47; Mann-Whitney
U test), and the differences between sexes nonsignificant.

Punishment and Latency of Response 

According to theory, the degree of dis- 
sonance after decision to accept punishment or 
to indicate reception of ESP is a function of 
the conflict before the decision. Conflict could 
be estimated from latency of responses. A 
measure of response latency was recorded by 
the multiple-event recorder—the length of 
time between the cessation of the buzzer sig- 
nals and the Yes or No response by the sub- 
ject. We would expect that the greater the 
latency, the greater the conflict before decision, 
and consequently, the greater the dissonance 
after decision. If our experimental conditions 
were appropriate, we would then expect greater 
latency in the Shock conditions than in the No 
Shock conditions. As can be seen in Table 1 
such differences in latency seemed to occur 
only for females. In the early trials, especially, 
the females in the Shock condition show con- 
siderably greater latency in response than do 
the males in the Shock condition or males and 
females in the No Shock condition. The shock 
does not seem to make much of a difference for 
males, insofar as latency of response is con- 
cerned. We find significant effects of the shock 
variation, sex differences, and a significant 
interaction effect (p = .05; analysis of variance).
The mean latency of the females in the
Shock condition (.85 seconds) was also significantly
greater than that of subjects in the
other three conditions combined (.32 seconds;
p < .01).

One might speculate as to why the males 
showed no increase in latency of response in 
the shock condition, particularly since they 
did indicate that they found the shock as pain- 
ful, on the rating scale, as did the females in 
TABLE 1
MEAN CORRECTED LATENCY OF RESPONSE
IN SECONDS

            Shock    No Shock
Female       .85       .31
Male         .36       .28

Note.—Mean of three lowest latencies were subtracted from
mean of three highest latencies for each subject, thus correcting
for individual differences. In this table, as in others, there are 13
males and 13 females in each experimental condition. Effects of
difference of sex, condition, and interaction significant at less than
.05 level by analysis of variance. Females in Shock condition are
significantly different from all other conditions (p < .01).


the Shock condition. One reasonable possibility 
is that the males saw the threat of shock as a
test of their masculinity, and perhaps gained
some intrinsic reward from rapidly responding
No. Some anecdotal reports at the end of the
experiment support this view. Steiner (1960)
and Tuddenham (1959) have found greater 
conformity tendency in females, and an asso- 
ciation of independence with masculinity. In 
terms of the amount of conflict produced by 
the shock, it appears in any case that the males 
in the Shock condition were more similar to 
males and females in the No Shock condition 
than to the females in the Shock condition. 
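
The latency correction described in the note to Table 1 can be sketched as below. The function name and the twelve latencies are hypothetical; only the rule itself (mean of the three highest minus mean of the three lowest) comes from the note.

```python
def corrected_latency(latencies):
    """Mean of the three highest latencies minus mean of the three lowest."""
    if len(latencies) < 6:
        raise ValueError("need at least six trials so the two sets do not overlap")
    ordered = sorted(latencies)
    return sum(ordered[-3:]) / 3 - sum(ordered[:3]) / 3


# Hypothetical latencies (seconds) for one subject's 12 trials
trials = [0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4, 1.5]
score = corrected_latency(trials)
print(round(score, 2))
```

Because each subject's own fastest responses serve as the baseline, a subject who is uniformly slow obtains the same corrected score as one who is uniformly fast.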


Pretest Belief Measure 

Before testing whether differences in con- 
flict before decision, and acceptance of punish- 
ment, were related to change in belief after de- 
cision, it was first necessary to establish that 
there were no significant differences in pre- 
test scores. Since classroom test scores were 
used to assign subjects to groups, we might 
have expected no differences in the scores 
taken at the beginning of the experimental ses- 
sion. Unfortunately, our assumption here was 
incorrect (see Table 2). Subjects in the Shock 
condition unaccountably showed higher pre- 
test belief scores than those in the No Shock 
condition. This lack of comparability in the 
pretest complicates the testing of our major 
hypothesis. 


Change in Belief as a Function of Dissonance 


The basic hypothesis predicted that when 
the subject received shock for reporting that he 
did not receive ESP, he would be especially 
likely to change his belief in the direction of 




























ACCEPTANCE OF PUNISHMENT AND CHANGE IN BELIEF 415 


TABLE 2
PRETEST BELIEF SCORES

            Shock    No Shock
Female      23.3      20.7
Male        24.6      22.7


Note.—Scores could range from 5 (no belief in existence of 
ESP) to 35 (strong belief in existence of ESP) with a score of 20 
being the midpoint. Difference between conditions was significant 
at less than the .05 level by analysis of variance (F = 4.828; df = 
1/48). Sex differences and interaction were not significant. 


TABLE 3
MEAN REDUCTION IN BELIEF IN ESP
FROM PRETEST TO POSTTEST

            Shock    No Shock
Female      7.00      3.71
Male        3.54      3.68

Note.—Belief scores could vary from 5 (disbelief) to 35 (strong
belief), on the AB scales. Over-all analysis of variance and interaction
was not significant. Assuming that male shock subjects
were part of a common population with the nonshock subjects,
with respect to dissonance, an analysis of variance was conducted
which showed the female shock subjects to be significantly different
from the others (F = 5.39; df = 1/50; p < .05).


rejecting the existence of ESP. According to 
our original design, this was to be tested by an 
analysis of variance, comparing Shock and No 
Shock conditions on the amount of change in 
belief from pretest to posttest. However, since 
the male Shock subjects did not indicate con- 
flict as a function of the shock, we should not 
have been surprised that the over-all analysis 
of variance did not indicate a significant differ- 
ence (see Table 3). 

The similarity in latency scores of the three 
groups other than the female Shock condition 
(Table 1) would give us some justification in 
combining these three groups in testing our 
dissonance theory hypothesis. It was clear, of 
course, that such an analysis would have to 
be evaluated in light of the fact that it was 
conducted on the basis of data examined after 
the experiment had been completed. In addi- 
tion, it was important to consider the differ- 
ences in pretest scores. Thus we retested our 
hypothesis, using an analysis of covariance, 
which in effect controlled for the differences in 
pretest scores. This analysis (see Table 4) did 
provide support for our major hypothesis. 
Controlling for initial belief in ESP, the female 
subjects who continued to report “no ESP,” 
and who experienced conflict in accepting such 


TABLE 4
ANALYSIS OF COVARIANCE OF CHANGE CONSIDERING
PRETEST BELIEF SCORES

                                  Female shock    All others
                                    (N = 13)       (N = 39)
Mean pretest belief                [illegible]    22.[illegible]
Mean amount of negative change     [illegible]     3.9

Note.—Analysis of variance on pretest scores significant at
less than .01 level (F = 64.3; df = 1/50); analysis of variance on
change significant at less than .05 level (F = 5.39; df = 1/50);
analysis of covariance for change significant at less than .02 level
(F = 6.46; df = 1/49).


TABLE 5
MEAN ATTITUDE SCORES

                    Pretest    Posttest    Change
Female, Shock        21.5       19.8       -1.7
Male, Shock          23.5       22.8       -0.7
Female, No shock     22.8       20.8       -2.0
Male, No shock       23.8       21.7       -2.1


Note.—Scores could range from 5 (indicating great dislike 
for ESP) to 35 (great liking for ESP). Analysis of variance on 
pretest scores and change scores yielded no significant differences. 


punishment, were significantly more likely to 
change their belief in the direction of rejecting
ESP. This effectively reduced dissonance, 
justifying their acceptance of the punishment 
by convincing themselves even more firmly 
that ESP did not exist. 
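
The following is a minimal sketch of a one-way analysis of covariance with a single covariate, the kind of test summarized in Table 4. The function name, the data layout, and the numbers are hypothetical; the original computations are not reported in the paper. With 13 + 39 = 52 subjects in two groups, the within-groups degrees of freedom come to 52 - 2 - 1 = 49, which agrees with the df of 1/49 given in the table note.

```python
def ancova_f(groups):
    """One-way analysis of covariance with one covariate.

    groups: list of groups, each a list of (covariate, outcome) pairs.
    Returns (F, df_between, df_within) for the adjusted group effect.
    """
    all_pairs = [p for g in groups for p in g]
    n, k = len(all_pairs), len(groups)

    def sums(pairs):
        # Sums of squares and cross-products about the means of `pairs`.
        mx = sum(x for x, _ in pairs) / len(pairs)
        my = sum(y for _, y in pairs) / len(pairs)
        sxx = sum((x - mx) ** 2 for x, _ in pairs)
        syy = sum((y - my) ** 2 for _, y in pairs)
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        return sxx, syy, sxy

    sxx_t, syy_t, sxy_t = sums(all_pairs)
    within = [sums(g) for g in groups]
    sxx_w = sum(w[0] for w in within)
    syy_w = sum(w[1] for w in within)
    sxy_w = sum(w[2] for w in within)

    # Outcome variation left after removing the linear effect of the covariate.
    adj_total = syy_t - sxy_t ** 2 / sxx_t
    adj_within = syy_w - sxy_w ** 2 / sxx_w
    adj_between = adj_total - adj_within
    df_b, df_w = k - 1, n - k - 1
    return (adj_between / df_b) / (adj_within / df_w), df_b, df_w


# Hypothetical (pretest, change) pairs for two small groups
group_a = [(1, 2), (2, 4), (3, 7)]
group_b = [(1, 1), (2, 3), (3, 4)]
f_value, df_b, df_w = ancova_f([group_a, group_b])
print(round(f_value, 3), df_b, df_w)
```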


SUMMARY AND DISCUSSION 


In summary, we predicted from Festinger’s 
(1957) theory of cognitive dissonance that 
subjects who consistently denied receiving 
ESP images would be especially likely to re- 
duce their belief in ESP if their denial of re- 
ception led to shock. Such a reduction in be- 
lief would reduce dissonance between the 
knowledge that they had reported “no recep- 
tion” and the knowledge that they had thus, 
in effect, accepted punishment. The prediction 
was confirmed for female subjects. Our results 
were also consistent for male subjects, if we 
allow for the evidence that males also seemed 
to find the shock less dissonance producing. 

These findings are especially significant 
since they would seem contrary to what we 
might expect from “common sense” or from a 
reinforcement theory. From the latter, we 
might have expected that the shock would 
serve as a negative reinforcer for the response 
of pushing the “no reception” key, and that
this would generalize to a “no ESP” response
on the belief questionnaire. Specifically, we
would expect that shocking a “no ESP” response
would then lead the person to indicate
that he believes even more strongly in ESP,
i.e., making a “yes ESP” response on the questionnaire.
Yet the opposite was obtained in
this experiment. Is it then possible that our re- 
sults could be attributed to some generalized 
change in attitude? Perhaps the series of shocks 
increased the negativeness of the entire ex- 
perimental situation, and served as a negative 
reinforcer for any ESP response. Then on a 
questionnaire any response indicating exist- 
ence of ESP or liking for ESP would be 
avoided. Fortunately, our measures allow us 
to ascertain change in attitudes toward ESP 
directly, by using the evaluative dimension 
(A scale) of the AB scales. These scores, ranging
from 5 (indicating great dislike) to 35 (great
liking for ESP), show no significant change in
any of the experimental conditions (see Table 
5). This explanation can also, therefore, be re- 
jected. 

Our hypothesis and findings are, however, 
quite consonant with Festinger’s (1957) analy- 
sis of data from the study by Kelman (1953), 
and also complement the study by Festinger 
and Carlsmith (1959). Festinger and Carlsmith 
found that if reward was sufficient to bring 
about compliance, the greater the reward, the 
less the dissonance, and the less beliefs would 
change so as to become consistent with be- 
havior. They would predict the same effect if 
punishment rather than reward had been used 
to bring about compliance. We found that
when punishment was insufficient to bring 
about compliance, the greater the punish- 
ment, the greater the dissonance, and the more 
beliefs would change so as to become consonant 
with this noncompliance. We would predict the 
same effect if reward had been used rather than 
punishment. Clearly a study is now needed to 
investigate the effects of reward or punish- 
ment at all four levels: very low, moderate and 
just insufficient to bring about compliance, 


moderate and just sufficient to bring about 
compliance, and very high. 
GSR CONDITIONING AND PERSONALITY FACTORS
IN ALCOHOLICS AND NORMALS¹


MURIEL D. VOGEL²
Alcoholism Research Foundation, Toronto 


EYSENCK (1957) has suggested that the
ease with which CRs are acquired and
retained is dependent upon the personality
factor of introversion-extraversion.
Franks (1956) has examined this hypothesis 
by comparing eyeblink conditioning of clini- 
cally diagnosed hysterics with that of “dys- 
thymics” (i.e., patients suffering from anxiety, 
reactive depressions, and obsessions or com- 
pulsions). These two groups are identified as 
“neurotic extraverts” and “neurotic intro- 
verts,” respectively, in Eysenck’s personality 
schema. The hysterics were found to acquire 
the conditioned eyeblink more slowly than the 
dysthymics, who conditioned quickly. No 
difference in conditioning was found between 
a normal group and a neurotic group containing
hysterics and dysthymics. Similar results
also were obtained in another study (Franks, 
1957) which examined the conditioned eye- 
blink of paid normal subjects in relation to 
personality tests of extraversion and neuroticism.
More extraversive subjects, defined
in terms of higher test scores on the Maudsley
Personality Inventory (MPI), acquired the
CR more slowly and extinguished this response 
more quickly than did less extraversive (i.e., 
more introversive) subjects, but no relation 
between conditioning and neuroticism test 
scores was observed. 

The conditioned galvanic skin reflex (GSR)
was recorded simultaneously with conditioned
eyeblink in Franks’ (1956) study. GSR conditioning
was developed more slowly in hysterics
than in dysthymics, but the differences
between the two groups were slight, and extinction
could not be examined because GSR
adaptation occurred too rapidly under Franks’
testing procedures. He suggested that the
difference between the groups in GSR conditioning
might be reduced because of the low
intensity airpuff which was employed as the
UCS. He noted that this weak UCS sometimes
failed to elicit the unconditioned GSR after
the first few trials. Vogel (1960b), using a bell
as the UCS, examined GSR conditioning of
alcoholics in relation to the personality variables
of extraversion and neuroticism. Although
acquisition and extinction of the conditioned
GSR were found to relate as predicted
to extraversion scores on the MPI, the rate
of CR acquisition was observed also to
correlate with neuroticism scores. Vogel’s
(1960b) observations appear to contradict
Eysenck’s theory, which holds extraversion
and neuroticism to be independent with extraversion
as the variable relevant to conditioning
behavior.

1 This paper is adapted from a dissertation submitted
in partial fulfillment of the requirements for the PhD
degree at the University of Toronto.

2 The author wishes to express her appreciation to
A. H. Shephard for his advice in connection with this
study and to R. H. Walters for assistance in the preparation
of the paper.

Since there has been no research specifically 
designed to compare alcoholics and nonalco- 
holics in GSR conditioning, the correlations 
reported by Vogel (1960b) between condition- 
ing and personality may apply only to alco- 
holics. This possibility was investigated in the 
present study. GSR conditioning was con- 
ducted with alcoholics and with nonalcoholic 
“normals” (i.e., subjects not hospitalized for 
alcoholism or other psychiatric disorder). 
Because these two groups could be expected to 
differ in neuroticism, and because the author’s 
previous study found alcoholics’ neuroticism 
scores correlated with speed of acquisition, 
covariance analyses were employed to permit 
an examination of GSR conditioning in rela- 
tion to extraversion, statistically free from 
possible influence by differing neuroticism 
scores. The experimental hypothesis, in accord 
with Eysenck’s theory, was formulated as fol- 
lows: 

1. When differences in neuroticism are con- 
trolled in alcoholic and nonalcoholic groups, 
the conditioned GSR is elicited in fewer trials 
and is more resistant to extinction in intro- 
versive than in extraversive subjects. 

In addition to the finding that, among alco- 
holics, extraversion was associated with slower 
acquisition of the conditioned GSR, Vogel
(1960b) also noted that some alcoholics failed 
to display the conditioned response within the 
maximum number of trials permitted in her 
procedure. On the assumption that slower 
conditioning is related solely to the extra- 
version variable, the following hypothesis was 
set forth: 

2. Alcoholic and nonalcoholic individuals 
with extraversive test scores more frequently 
fail to display a conditioned GSR than those 
with introversive scores. 

METHOD 
Subjects 

The sample of alcoholics consisted of 48 males, all 
of whom were admitted to the Alcoholism Research 
Foundation Clinic during the time the study was being 
conducted. All male patients in the clinic were tested, 
provided they could read English sufficiently to com- 
plete a questionnaire. Conditioning data could not be 
obtained for eight subjects who failed to condition 
within the maximum number of trials permitted in the 
conditioning procedure. These subjects were considered 
separately for investigation of the second experimental 
hypothesis. 

The control group contained 41 adult male volun- 
teers who were not hospitalized for alcoholism. The 
majority of these subjects were first and second year 
University of Toronto summer school students. Volun- 
teers were obtained also from the Ontario College of 
Education summer school course and from professional 
and technical personnel who visited the building where 
the experiment was being conducted. One subject in 
this group failed to display conditioning in the maxi- 
mum number of acquisition trials allowed and was 
used only in the testing of the second hypothesis. 


Procedure 


All subjects completed the MPI (Eysenck, 1956), 
and the alcoholics also reported on certain drinking 
behavior. The latter data were obtained for another 
study reported elsewhere (Vogel, 1961). 

The short term of hospitalization and the small 
patient population of the clinic made selection of only 
extremely high and low extraversion (E) score subjects 
highly impractical. For this reason, all available subjects 
were employed in this study. E scores on the MPI 
from both university students and alcoholic patients 
at the Alcoholism Research Foundation Clinic had been 
found previously (Vogel, 1960a) not to differ signifi- 
cantly from the mean score of 24.6 reported by Eysenck 
(1956) for normal males. The score of 24.6 was selected, 
therefore, as a measure of central tendency in E scores 
appropriate to samples of both alcoholics and non- 
alcoholics. Subjects were dichotomized into two cate- 
gories—those below and those above 24.6. For 
convenience, these two categories are identified, respectively,
as “introversive” and “extraversive.”
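
The dichotomy described above amounts to a single comparison against the cutoff of 24.6. The sketch below uses hypothetical E scores; how a score exactly equal to the cutoff would be treated is not stated in the text and is assumed here to be immaterial, on the further assumption that MPI scores are whole numbers.

```python
CUTOFF = 24.6  # mean E score for normal males reported by Eysenck (1956)


def classify(e_score, cutoff=CUTOFF):
    """Label a subject by position of the E score relative to the cutoff."""
    return "introversive" if e_score < cutoff else "extraversive"


# Hypothetical E scores
scores = [12, 24, 25, 38]
labels = [classify(s) for s in scores]
print(labels)
```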


In the alcoholic sample of 40 cases, the mean E
score was 20.98 (SD = 9.42) and the mean neuroticism
(N) score was 35.85 (SD = 11.19). Twenty subjects 
had E scores less than 24.6, and the remaining alco- 
holics had E scores greater than 24.6. The nonalcoholic 
group was also composed of 20 subjects in each of these 
two categories. The mean E score for the 40 nonalco- 
holics was 23.77 (SD = 9.42); the mean N score was 
20.98 (SD = 11.59). 

The GSR conditioning procedure and measures 
have been fully described elsewhere (Vogel, 1960a) 
and may be summarized briefly. Subjects were tested 
individually for conditioning in a semisoundproof 
room. The subject was told that the test was one of 
relaxation or repose and that his task was to spell the 
syllables which were presented to him by a memory 
drum.

The CS was a nonsense syllable, LAJ, which appeared
16 times, randomly placed among 35 other low
association value syllables (Glaze, 1928). The memory
drum presented a different syllable every 6 seconds.
The UCS was an unpleasantly loud doorbell buzzer 
which reliably elicited the UCR of abrupt change in 
skin conductance. The first presentation of the CS was 
followed in .5 second by the UCS. A 50% reinforcement 
schedule was employed so that alternate presentations 
of the CS were reinforced. Skin resistance was measured 
by a Lafayette 601-A GSR amplifier, using finger 
clamp electrodes and an Esterline-Angus pen recorder 
which traced the GSR continuously. Every 6 seconds, 
immediately before presentation of a new syllable by 
the memory drum, the GSR amplifier was balanced to 
the subject’s skin resistance at that moment. This 
procedure automatically centered the Esterline-Angus 
pen at 0 on the record chart and permitted an easy
comparison of GSR responses to each syllable during 
the conditioning procedure. Accurate identification of 
response to the CS and UCS was permitted by a side 
pen which automatically marked the chart when these 
stimuli were presented. 

The criterion of conditioning was similar to that 
employed by Welch and Kubis (1947a, 1947b), which 
required three consecutive GSR reactions to the CS 
(LAJ) when unaccompanied by the UCS (bell). A GSR 
reaction was defined as a change in skin resistance to 
the CS which was larger than the subject’s largest 
GSR occurring to the intervening buffer syllables. The 
number of reinforcements (i.e., the number of times 
the bell was presented) prior to achievement of this 
criterion constituted the count of trials to acquire the 
CR. A larger score thus indicated slower conditioning. 
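
The acquisition criterion can be sketched as a scan over the ordered CS presentations. The tuple layout and function name are hypothetical, and two points are assumptions not settled by the text: that reinforced presentations do not interrupt a run of unreinforced reactions, and that the largest GSR to the intervening buffer syllables is supplied with each presentation.

```python
def trials_to_criterion(presentations, needed=3):
    """Count reinforcements before `needed` consecutive unreinforced CS reactions.

    presentations: ordered (reinforced, cs_gsr, max_buffer_gsr) tuples.
    Returns the reinforcement count at criterion, or None if never reached.
    """
    reinforcements = 0
    streak = 0
    for reinforced, cs_gsr, max_buffer_gsr in presentations:
        if reinforced:
            reinforcements += 1  # CS accompanied by the bell
            continue
        if cs_gsr > max_buffer_gsr:
            streak += 1          # a GSR reaction to the CS alone
            if streak == needed:
                return reinforcements
        else:
            streak = 0           # run of reactions broken
    return None


# Hypothetical record on a 50% schedule (reinforced, then unreinforced)
record = [
    (True, 5, 2), (False, 1, 2),
    (True, 5, 2), (False, 3, 2),
    (True, 5, 2), (False, 4, 2),
    (True, 5, 2), (False, 3, 1),
]
score = trials_to_criterion(record)
print(score)
```

A return value of None corresponds to a subject who would be grouped with those who failed to condition.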

Vogel (1960b) noted that this conditioning pro- 
cedure could not satisfactorily be continued much 
longer than 20 minutes because of increasing boredom 
or restlessness on the part of the subject. Slight shifts 
in posture, sighs, or comments aside to the experimenter 
all altered skin conductance somewhat, and under 
such conditions a record of conditioning could not 
reliably be obtained. For this reason, a limit of 15 
minutes was placed on acquisition trials. This duration 
permitted 24 presentations of the CS accompanied by 
UCS and 24 unreinforced presentations. The maximum 
possible score for a subject was thus 24, and if conditioning
had not then been achieved, the subject was
grouped with the others who “failed to condition.”
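
A rough consistency check on the figure of 24 can be made from the presentation rate. The sketch assumes that the list of 51 syllables (16 CS and 35 others) simply repeats for the full 15 minutes, which the text does not state explicitly.

```python
SECONDS_PER_SYLLABLE = 6
CS_PER_LIST = 16
OTHER_PER_LIST = 35
LIMIT_MINUTES = 15

# Total syllables the memory drum can present within the time limit.
syllables_shown = LIMIT_MINUTES * 60 // SECONDS_PER_SYLLABLE

# Share of presentations that are the CS, if the list repeats unchanged.
cs_fraction = CS_PER_LIST / (CS_PER_LIST + OTHER_PER_LIST)
cs_presentations = syllables_shown * cs_fraction

# Half of the CS presentations are reinforced on the 50% schedule.
reinforced = cs_presentations / 2
print(syllables_shown, round(cs_presentations, 1), round(reinforced, 1))
```

Under that assumption the estimate falls just short of 48 CS presentations, close to the 24 reinforced and 24 unreinforced presentations reported.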

Extinction trials commenced immediately after the
subject displayed the CR. The UCS was no longer 
presented, and the subject continued to spell syllables 
as they were presented by the memory drum. His 
GSR responses were recorded, and the number of 
CRs to the first 10 unreinforced presentations of the 
CS were counted. A CR was defined as a GSR to the 
CS which was larger than the subject’s largest GSR 
to the intervening neutral syllables. This score in- 
dicated the number of CRs during 10 extinction trials. 
The subject thereafter continued spelling syllables 
until “extinction” occurred. “Extinction” in this case 
was defined as three consecutive presentations of the 
unreinforced CS, all of which failed to elicit a CR. In 
many cases, this had occurred by the time the 10 
standard extinction trials were completed. Two scores 
on extinction were thus obtained for each subject: the 
number of CRs displayed in 10 unreinforced trials, and 
the number of CRs prior to extinction. In both cases, 
a larger score was considered to indicate greater CR 
resistance to extinction. 
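
The two extinction scores can be illustrated with a short sketch. It assumes each unreinforced presentation of the CS has already been judged as a CR or not by the rule given above; the function name and the boolean input are hypothetical conveniences, not part of the original procedure.

```python
from typing import List, Tuple


def extinction_scores(is_cr: List[bool],
                      standard_trials: int = 10,
                      run_length: int = 3) -> Tuple[int, int]:
    """Return (CRs in the first 10 unreinforced trials, CRs prior to extinction).

    "Extinction" is taken as `run_length` consecutive unreinforced
    presentations that fail to elicit a CR.
    """
    crs_in_standard = sum(is_cr[:standard_trials])

    crs_before_extinction = 0
    misses = 0
    for cr in is_cr:
        if cr:
            crs_before_extinction += 1
            misses = 0
        else:
            misses += 1
            if misses == run_length:
                break  # extinction reached; later trials are not counted
    return crs_in_standard, crs_before_extinction
```

If the record ends before three consecutive failures occur, the second score is simply the number of CRs observed so far. For both scores a larger value indicates greater resistance to extinction.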


RESULTS 


To examine GSR conditioning of alcoholic 
and nonalcoholic subjects in relation to ex- 
traversion, 2 × 2 covariance analyses were per- 
formed.* This technique was employed to ad- 
just the conditioning scores for any influence 
attributable to neuroticism (N) score differ- 
ences in these two groups. 

The covariance analysis of acquisition scores 
from alcoholic and nonalcoholic subjects is 
summarized in Table 1. After adjusting scores 
for the effect attributable to neuroticism, inter- 
action effects were nonsignificant. Alcoholic 
and nonalcoholic groups do not appear to 
differ in the number of acquisition trials, but a 
significant effect for personality is obtained 
(F = 78.58, df = 1/75, p < .01). Introversive 
subjects in alcoholic and nonalcoholic groups 
displayed the conditioned response in an 
average of 6.12 trials, whereas extraversive 
subjects averaged 13.05 trials before the CR 
was displayed. 
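
As an arithmetic check, each F ratio reported for acquisition can be recovered by dividing the mean square for an effect by the within-cells mean square. The sketch below uses the values printed in Table 1 and assumes that the numeric column of that table holds (adjusted) mean squares; the variable names are illustrative only.

```python
# Mean squares as printed in Table 1 (trials to acquire the conditioned GSR).
mean_squares = {
    "Group": 7.9,
    "Personality": 950.8,
    "Group x Personality": 20.7,
}
ms_within = 12.1  # within-cells mean square, df = 75

# F for each effect is its mean square over the within-cells mean square.
f_ratios = {source: round(ms / ms_within, 2)
            for source, ms in mean_squares.items()}

for source, f in f_ratios.items():
    print(f"{source}: F = {f}")
```

The ratios agree with the F values given in the table, including the value of 78.58 for personality.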

Table 2 summarizes the covariance analysis 
of CRs displayed in 10 extinction trials. No 
significant interaction effect was obtained, and 

*A detailed covariance analysis table for each 
conditioning measure has been deposited with the 
American Documentation Institute. Order Document 
No. 6868 from ADI Auxiliary Publications Project, 
Photoduplication Service, Library of Congress; Wash- 
ington 25, D. C., remitting in advance $1.25 for micro- 
film or $1.25 for photocopies. Make checks payable to: 
Chief, Photoduplication Service, Library of Congress. 

TABLE 1 

COVARIANCE ANALYSIS OF NUMBER OF TRIALS 
TO ACQUIRE A CONDITIONED GSR WITH N 
SCORES OF THE MPI AS COVARIANT 

Source                             df    Adjusted MS       F        p 
Group                               1         7.9         0.65    >.10 
Personality                         1       950.8        78.58    <.01 
Group X Personality interaction     1        20.7         1.71    >.10 
Within cells                       75        12.1 

TABLE 2 

COVARIANCE ANALYSIS OF NUMBER OF CONDITIONED 
GSRs OBSERVED DURING TEN EXTINCTION 
TRIALS WITH N SCALE SCORES OF 
THE MPI AS COVARIANT 

Source                             df    Adjusted MS       F        p 
Group                               1         0.7         0.21    >.10 
Personality                         1       148.1        43.56    <.01 
Group X Personality interaction     1         6.2         1.80    >.10 
Within cells                       75         3.4 

TABLE 3 

COVARIANCE ANALYSIS OF NUMBER OF CONDITIONED 
GSRs OBSERVED PRIOR TO EXTINCTION 
WITH N SCALE SCORES ON THE MPI 
AS COVARIANT 

Source                             df    Adjusted MS       F        p 
Group                               1        8.15         0.55    >.10 
Personality                         1      572.33        38.33    <.01 
Group X Personality interaction     1        6.61         0.44    >.10 
Within cells                       75       14.93 

no significant difference was observed between 
alcoholic and nonalcoholic subjects. The sig- 
nificant effect for personality (F = 43.56, df 
= 1/75, p < .01) indicates that introversive 
alcoholics and nonalcoholics displayed more 
conditioned responses (mean score of 5.82) 
during 10 extinction trials than did extraver- 
sive subjects in these two groups (mean score 
of 3.20). 

The covariance analysis of number of condi- 
tioned GSRs observed prior to extinction is 
summarized in Table 3. The only significant 
effect observed is that attributable to person- 
ality (F = 38.33, df = 1/75, p < .01). In- 
troversive subjects displayed an average of 
8.67 CRs prior to extinction, but extraversive 
subjects averaged only 3.47. The analyses 
presented in Tables 2 and 3 both indicate 
that the conditioned GSR was more resistant 
to extinction in alcoholic and nonalcoholic 
subjects having introversive test scores than 
in those having extraversive scores. 

An examination of personality test scores ob- 
tained in the group of nine subjects who did 
not display conditioning in the maximum num- 
ber of acquisition trials allowed revealed a 
striking agreement with the experimental 
hypothesis. All subjects who “failed to condi- 
tion” had E scores greater than 24.6 and thus 
would be categorized “extraversive” in this 
study. 


DISCUSSION 


Introversive alcoholics and nonalcoholics 
displayed a conditioned GSR which was more 
quickly elicited and more resistant to extinc- 
tion than that displayed by extraversive sub- 
jects. Since covariance analyses were per- 
formed in this study, the differences in GSR 
conditioning observed between introversive 
and extraversive subjects are statistically in- 
dependent of differing neuroticism scores. This 
finding is in accord with Franks’s (1956, 1957) 
studies of conditioned eyeblink and may be in- 
terpreted to support Eysenck’s theory which 
asserts that conditionability is dependent upon 
extraversion and is unrelated to neuroticism. 
An explanation for the seemingly contradictory 
observation (Vogel, 1960b) of an association 
between neuroticism and GSR conditioning in 
alcoholics is suggested in Eysenck’s recently 
revised MPI manual. He reports that while 
extraversion and neuroticism scales are almost 
entirely independent, a slight negative correla- 
tion is obtained, particularly if the sample 
contains only subjects with high neuroticism 
scores. Since alcoholics are found to have high 
neuroticism scores, such a correlation could be 
expected in Vogel’s (1960b) sample and may 
explain her observed correlation between 
neuroticism and GSR conditionability. 

The prediction that alcoholic and nonalco- 
holic subjects failing to display a conditioned 
GSR tend to have personality test scores classi- 
fied as extraversive rather than introversive 
was supported by the observation that all nine 
nonconditioned subjects had extraversive 
scores. Since eight of these were alcoholics, 
this finding could suggest that difficulty in 
GSR conditioning under the procedures of this 
study is more typical of alcoholics than of in- 
dividuals not hospitalized for alcoholism. In 
contrast to a mean extraversion score of 29.05 
obtained by the 20 extraversive alcoholics 
who displayed GSR conditioning, the alco- 
holics who did not condition had a mean score 
of 32.38. This might also suggest that failure 
to display GSR conditioning is associated with 
an extreme degree of extraversion, indicated 
by unusually high scores on the MPI. 

Some of the subjects who displayed the 
conditioning pattern verbalized no awareness 
of the systematic presentation of the bell with 
the CS, in many instances identifying the bell 
as a ringing telephone. Although it is almost 
certain that the academic backgrounds of some 
of the nonalcoholics would make them sophis- 
ticated in terms of conditioning techniques, 
these subjects were not observed to condition 
in consistently fewer trials than those who were 
completely naive to this procedure. Speed of 
acquisition and extinction of the CR appeared 
to be unrelated to the subject’s prior knowledge 
of conditioning principles or to his failure to 
associate the occurrence of the bell with the 
CS. This observation is in accord with results 
from other studies of GSR conditioning (Wall 
& Guthrie, 1959) and of conditioning the 
autonomic finger twitch response (Hefferline, 
Keenan, & Harford, 1959), indicating that 
verbal instructions or prior knowledge of the 
required response did not influence the speed 
with which the CR was established or ex- 
tinguished. 


SUMMARY 


Measures of GSR conditioning were ob- 
tained from 40 inpatients of an alcoholism 
clinic and from 40 nonalcoholic males. These 
subjects also completed the Maudsley Person- 
ality Inventory (MPI) which contains an in- 
troversion-extraversion scale. In accord with 
the experimental hypothesis, the CR was found 
to be more quickly acquired and more re- 
sistant to extinction in introversive than in 
extraversive subjects. Alcoholics and nonalco- 
holics did not differ either in mean extraver- 
sion scores or in mean rate of establishing and 
extinguishing the CR. In accord with the 
hypothesis that more extraversion is associated 
with poorer conditionability, subjects who 
failed to display conditioning within the maxi- 
mum number of training trials allowed were 
found to have extraversive test scores. It was 
noted that these results are consistent with 
some other eyeblink conditioning studies and 
might be interpreted as lending support to 
Eysenck’s personality theory. 

A vast amount of material is generated in 
therapeutic interviews, and adequate tech- 
niques for analyzing such material are in demand. 
In this paper, a technique of language analysis is 
applied to examining the changes in the speech of 
a schizophrenic patient over approximately one 
year of his psychotherapy. The study is illustrative 
of a method which has developed from the defini- 
tion of meaning as a matrix of verbal associations 
and of similarity of meaning as the extent of com- 
monality of associations (Bousfield, Cohen, & 
Whitmarsh, 1958). Words which appear within a 
given context in free speech or writing are taken to 
provide the meaning of that context. 

In making a contextual analysis, words are cate- 
gorized on the basis of similarity or synonymity of 
reference in order to reduce the vast number of 
different items in our language which tend to ob- 
scure underlying similarities. The actual analysis 
of a context then involves the tabulation of the 
frequencies of categories which appear in close asso- 
ciation with each other. Once a tabulation of the 
categories of associations has been made, it is 
possible to compare the distribution of associates 
in one language segment with that in some other 
language segment. Details of the technique have 
been set forth elsewhere (Laffal, 1959). There are 
94 categories, into no more than 2 of which any 
word may be categorized. In the course of review- 
ing large amounts of verbal material, a dictionary 
of the more common words that are appropriately 
scored under each category has been developed. 
The availability of this dictionary makes the cate- 
gorizing process fairly reliable. The 94 categories 
are shown in Table 1.* The actual application of 


1 This report is part of a larger study of language 
distortion in schizophrenia, which is supported by 
United States Public Health Service Grant M-2020. 

This paper has been reviewed in the Veterans Ad- 
ministration and is published with the approval of the 
Chief Medical Director. The statements and conclu- 
sions are the result of the author’s own study and do 
not necessarily reflect opinion or policy of the Veterans 
Administration. 

* The categories listed in Table 1 are not to be 
thought of as the best possible set, or as a final set of 
categories. Each research application of the set of 
categories contributes toward sharpening and defining 
the individual categories, and toward clarifying their 


relative usefulness 
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the categories to a sample of speech is illustrated 
in a selection from the speech of the patient whose 
language is analyzed in this paper. 


reason hollow male reason 
I’m thinking of the cave man. It wouldn’t teach 

hollow hollow 
male reason male reason good 
him a lesson because he didn’t know the virtues 

sex hollow 
join male reason join 
of marriage. He didn’t know how the bonds of 

sex 
join help help 
marriage were implemented to the satisfaction of a 

lead hollow 
sacred male go female female 
deity, so he could go from woman to woman, from 

group group group reason big 
family to family and society thinks that might double 

female difficult female 
the woman’s problem and that they would have to 

open open 
possess eat possess disagree structure 
not only find food, but find a different pattern 

living 
of life. 


The problem posed in the present study was to 
examine the speech of a schizophrenic patient* in 
the course of a year of psychotherapy in which 
certain gross, readily identified and characterized 
behavioral changes took place. At the beginning of 
the treatment the patient was on a locked ward, 
but had occasional privileges to leave the ward in 
his own custody. He was unpredictable in his 
behavior, showed inappropriate affect, laughing 
and smiling apparently in response to his own 
thoughts, and tended to become incomprehensibly 
abstruse in his conversation. Around 6 months 
after the beginning of treatment, the notes in the 
patient’s chart indicate that in the therapy the pa- 
tient was verging closer to significant material 
about the relationship with the therapist, but along 


* The patient, treated by the author, continues to 
be in treatment after 4 years. With occasional relapses 
the patient has made significant social improvement, 
to the point where he is presently attending college 
during the day, while living in the hospital. 


TABLE 1 

WORD CATEGORIES 

Categories 

agree             *disagree*         law            crime 
animal            *nature            lead           follow 
*begin*           end*               living*        dead 
*big*             *little            male           *female 
*body feature*    *body function     *near          far 
calm              *upset*            new            ancient* 
clean             dirty              open           shut 
*come*            go*                *play          *work* 
commodity         *money*            possess        want 
cure              *illness*          real           *unreal* 
*durable*         change             reason         absurd* 
earth             sea                *sacred*       profane 
easy              difficult*         see*           hear 
eat               drink              separate       join* 
*essential*       trivial            sex            homosex 
fast              slow*              *sharp         blurred 
*forward*         *back*             some           all 
free              confine            speak          *write* 
*good*            bad*               *strong*       weak 
help              *hurt              *structure*    ruin 
*hollow*          hill               true           false 
hot               cold               up             *down* 
in                *out*              young          mature* 
individual        *group 

Note.—An asterisk preceding a word indicates it was one of 
the 34 categories used in the correlations of the interviews of 
Patient A, Patient B, and Schreber’s writing. An asterisk following 
a word indicates it was one of the 34 categories used in the correla- 
tions of the early, middle, and late interviews of Patient A. 


with this were reports of increasingly psychotic 
ways of speaking, and also that the patient was 
hallucinating during the treatment hours. There 
were also reports of the patient shouting at and 
bumping purposely into sicker patients. There 
followed a period in which the patient was re- 
stricted to the ward, and was extremely difficult to 
communicate with, refusing to come to the doctor’s 
office and walking out after a few minutes if he did 
come. At the end of the year the patient was 
communicating better in therapy, talking about 
passes at home, and assumed a privilege card in 
order to work in the library. That this progress was 
not transitory is evidenced by the fact that shortly 
afterward the patient went to an open ward where 
he was elected president of the patient government. 
In the first year of treatment, the patient made a 
clearcut social improvement. 


It seemed reasonable to suppose that with the 
patient’s change toward social adaptation, his lan- 
guage might also show some changes with respect 
to choice and distribution of word categories. This 
could be examined both in terms of the overall 
category changes in interviews taken from various 
periods of the treatment, and by examining in 
detail the shifts in associates accompanying specific 
categories. 


Before actually looking in detail at the way the 
patient’s language changed, it is worthwhile to see 
whether the technique of language analysis which 
has been described is capable of discriminating 
between the styles of speech or writing of different 
individuals. If the technique is capable of doing 
this, and can also reflect changes within the style 
of one individual, its potential usefulness is obvi- 
ously very broad. Verbal material from three dif- 
ferent sources was therefore analyzed in order to 
explore individual differences in style, in addition 
to examining the productions of the individual 
patient at various points in his therapy to see 
whether his own style had changed with his shift 
toward social adaptation. 


METHOD 


In a prior study (Laffal, 1960) a large amount of 
written material from the autobiography of a psychotic 
patient, had already been categorized using the 94 
categories of the method of contextual associates. A 
number of psychotherapeutic interviews with another 
psychotic patient, Patient B, were also scored into the 
categories; and a sampling from the material of the 
patient whose interviews over a year of therapy were 
to be analyzed, was taken for the purpose of the inter- 
individual comparisons. There were thus available 
separate profiles of the 94 categories from three different 
sources: The written autobiography of the patient 
Schreber, the interview material of Patient A, and the 
interview material of Patient B, Patient A being the 
one whose therapeutic interviews were to be further 
analyzed. 

The method of comparing profiles of categories 
was to correlate them, using Pearson r, with the 
categories substituting for “individuals” and the source 
of the data substituting for “variables.” Thus Patients 
A, B, and S were the variables, and there were 94 
categories or “individuals” to be compared. The actual 
scores were the frequencies of occurrence of the various 
categories in the speech samples of each of the patients. 

Reliability of profiles is determined by randomly 
splitting the speech sample from each patient, taking 
alternate lines, into two subsamples, and correlating 
the profiles of these subsamples. In addition, the availa- 
bility of two profiles for any patient, makes possible 
four cross-patient correlations for any pair of patients. 
The distribution of these four correlations may be 
used as the basis of tests to determine whether S is 
more like A, than he is like B, or whether A and B 
are more like each other than either of them is like S. 
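
The profile comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's scoring procedure: each typed line of a transcript is represented by a hypothetical list of the category labels scored on it, and all function names are invented for the example.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List, Sequence, Tuple

ScoredLine = List[str]  # category labels scored on one typed line


def split_alternate_lines(lines: Sequence[ScoredLine]
                          ) -> Tuple[List[ScoredLine], List[ScoredLine]]:
    """Split a transcript into two subsamples by taking alternate lines."""
    return list(lines[0::2]), list(lines[1::2])


def profile(lines: Sequence[ScoredLine], categories: Sequence[str]) -> List[int]:
    """Frequency of each category, listed in a fixed category order."""
    counts = Counter(label for line in lines for label in line)
    return [counts[c] for c in categories]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson r with categories as "individuals".

    Raises ZeroDivisionError if either profile is constant.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def compare_sources(a_lines: Sequence[ScoredLine],
                    b_lines: Sequence[ScoredLine],
                    categories: Sequence[str]) -> Dict[str, object]:
    """Split-half reliabilities and the four cross-source correlations."""
    a1, a2 = (profile(h, categories) for h in split_alternate_lines(a_lines))
    b1, b2 = (profile(h, categories) for h in split_alternate_lines(b_lines))
    return {
        "reliability_a": pearson_r(a1, a2),
        "reliability_b": pearson_r(b1, b2),
        "cross": [pearson_r(p, q) for p in (a1, a2) for q in (b1, b2)],
    }
```

The two reliabilities correspond to the self-correlations of each source, and the four values under `"cross"` correspond to the four cross-patient correlations available for any pair of sources.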

Despite the fact that the speech samples are cate- 
gorizable into 94 different categories, only 34 of the 
categories are used in the correlations, the remaining 
60 categories being discarded by the application of 
certain systematic procedures for the elimination of 
categories. There are two reasons for this elimination 
procedure. The first has to do with the fact that cate- 
gories which occur either with consistently high fre- 
quency or consistently low frequency in all samples of 
speech do not discriminate between the samples, but 
only lead to artifactually high correlations. For example, 
if the patient, in all of his speech about any topic 
whatever, uses certain categories very frequently, and 
others rarely, these categories would not discriminate 
between his speech about one topic or another, but 
would tend to obscure any differences which might 
exist in his speech about the different topics. The 
second consideration in the elimination of categories is 
the product of a systematic exploration of the changes 
in correlations between profiles of speech samples with 
the elimination of various categories from the correla- 
tions. On the basis of a consistent finding that two 
speech samples made up by taking alternate lines of 
transcribed text from the corpus of speech of one in- 
dividual, correlate very highly with each other as 
compared to their correlations with samples from other 
individuals, a systematic study was made of the 
changes in these self-correlations versus other-correla- 
tions as categories were eliminated from the correla- 
tions. Thus beginning with no eliminations from the 94 
categories, and progressively eliminating more and 
more of the consistently high frequency categories and 
the consistently low frequency categories, in various 
combinations, that arrangement of eliminations was 
sought which would give the greatest difference between 
the self-correlations from a single corpus, versus the 
other-correlations against some other corpus of ma- 
terial. After the categories were ranked from high to 
low based on explicit criteria for such rankings,* various 
combinations of eliminations were tried in order to see 
which would give the highest difference between self- 
correlations and other-correlations. The combinations 
of eliminations tried were: for high categories, 0, 4, 
8, 12, 14, and 16; for low categories, 0, 8, 16, 24, 32, 44, 
and 52. The arrangement that seemed to give the 
greatest consistent difference between self-correlations 
and other-correlations, was that in which 16 high and 44 
low categories were eliminated. 
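
The elimination step can be sketched as below. The sketch follows the ranking rules given in the accompanying footnote (high rank by the largest minimum frequency across profiles, low rank by the smallest maximum frequency), but two details are assumptions of this example rather than statements in the text: ties are broken by the order in which categories are supplied, and the low-frequency ranking is applied to the categories that remain after the high-frequency ones are removed. The function names are hypothetical.

```python
from typing import Dict, List, Sequence


def rank_high(profiles: Dict[str, Sequence[int]]) -> List[str]:
    """Rank categories from high to low by their lowest frequency
    across the profiles being compared ("highest low number")."""
    return sorted(profiles, key=lambda c: min(profiles[c]), reverse=True)


def select_categories(profiles: Dict[str, Sequence[int]],
                      n_high: int = 16,
                      n_low: int = 44) -> List[str]:
    """Drop the n_high consistently high and n_low consistently low
    categories; return the categories kept for correlation."""
    high = set(rank_high(profiles)[:n_high])
    remaining = [c for c in profiles if c not in high]
    # Lowest rank goes to categories whose frequency never exceeds 0,
    # then 1, then 2, etc., i.e. ascending order of maximum frequency.
    by_low = sorted(remaining, key=lambda c: max(profiles[c]))
    low = set(by_low[:n_low])
    return [c for c in remaining if c not in low]


# The three-category example from the footnote on ranking.
example = {
    "1": [12, 17, 21, 16, 13, 30],
    "2": [22, 19, 31, 14, 11, 12],
    "3": [16, 21, 14, 9, 13, 12],
}
print(rank_high(example))
```

With the default arguments and 94 categories, 34 categories remain, which matches the number used in the correlations.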


RESULTS AND DISCUSSION 


Tables 2 and 3 show the reliabilities and the 
intercorrelations of speech samples from thera- 


*In ranking a set of profiles of categories, the 
highest ranked category was taken as that which, for 
all of the profiles being compared, had the highest low 
number. Thus if six profiles were being compared, 
and there were three categories which had frequencies 
as below, the highest ranked category would be Cate- 
gory Number 1, since its lowest number is 12; the 
second highest ranked category would be Category 
Number 2, since its lowest number is 11. 

CATEGORY        FREQUENCIES IN PROFILE 
NUMBER        A     B     C     D     E     F 
   1         12    17    21    16    13    30 
   2         22    19    31    14    11    12 
   3         16    21    14     9    13    12 

In ranking the low frequency categories, lowest rank 
was assigned to categories with zero frequencies, then 
with frequencies not exceeding one in any of the vari- 
ables, then with frequencies not exceeding two in any 
of the variables, etc. For each set of profiles com- 
pared it is necessary to apply this procedure independ- 
ently. The 34 categories used in correlating profiles 
are thus somewhat different for different sets of data. 

TABLE 2 

RELIABILITIES AND INTERCORRELATIONS OF SPEECH 
SAMPLES FROM PATIENTS A AND B, AND 
SCHREBER’S AUTOBIOGRAPHY 
(Number of categories correlated = 34) 

                    Patient A          Patient B       Schreber 
                   A1       A2        B1       B2         S1 
Patient A    A2   .801 
Patient B    B1   .099     .027 
             B2   .189     .127      .833 
Schreber     S1  -.008    -.031     -.472    -.308 
             S2  -.057    -.136     -.537    -.510       .935 

TABLE 3 

RELIABILITIES AND INTERCORRELATIONS OF SPEECH 
SAMPLES FROM PATIENTS A AND B AND 
FROM SCHREBER’S AUTOBIOGRAPHY 
(Number of categories correlated = 94) 

                    Patient A          Patient B       Schreber 
                   A1       A2        B1       B2         S1 
Patient A    A2   .932 
Patient B    B1   .829     .812 
             B2   .845     .822      .956 
Schreber     S1   .690     .672      .630     .634 
             S2   .653     .652      .592     .580       .948 

peutic interviews with Patients A and B, and from 
Schreber’s autobiography, with no categories 
eliminated (94 remaining), and with 60 categories 
eliminated (34 remaining). The 34 remaining cate- 
gories are identified by asterisks before the words, 
in Table 1. There are clear distinctions between 
the speech samples of Patients A and B and of 
Schreber’s autobiography. The general pattern 
shown in Table 3 where no categories were elimi- 
nated, is accentuated in Table 2 where only 34 
categories were used. The profiles of contextual 
associates of Patients A and B are both very dif- 
ferent from that of Schreber, the difference being 
more marked for Patient B. 

Clearly, the technique of analysis of contextual 
associates distinguishes between the styles of 
speech and writing of separate individuals. The 
technique was also applied to the analysis of thera- 
peutic interviews obtained over the course of a 
year of treatment of Patient A. Four interviews 
dated August 27, September 12, September 14, and 
September 17 of 1956; four interviews dated 
February 11, February 20, March 4, and March 11 
of 1957; and five interviews dated October 3, 
October 8, October 10, October 24, and October 31 
of 1957 were taken for the analysis. The selection 
of these particular interviews was determined by 
their relatively close grouping in a series of 30 inter- 
views which had been recorded during the year of 
treatment. The first four interviews were at the 
beginning of treatment of the patient; the second 
four occurred about 6 months after treatment 
began; and the third set of interviews was about a 
year after treatment began. The third set con- 
tained five interviews, since two were relatively 
brief. 

The initial hypothesis tested was that the three 
sets of interviews would show a shift in the profiles 
of contextual associates, reflecting improvement 
in the patient’s psychiatric condition. It was pre- 
dicted that the profiles of the last set of interviews 
would be most different from those of the early 
and the middle sets of interviews, reflecting the 
fact that the last interviews were obtained under 
conditions of relative psychological integration. 
Table 4 shows the reliabilities and the intercorrela- 
tions of the profiles of contextual associates taken 
from the early, middle, and late interviews in the 
psychotherapy of Patient A. The 34 categories 
used in the correlations are identified by asterisks 
after the words in Table 1. 

In Table 4 the reliabilities for the early, middle, 
and late interviews range from .704 to .752. Both 
the early and middle interviews differ radically 
from the late interviews, and are more like each 
other than like the late interviews. 

Since the technique of analysis of contextual 
associates discriminates successfully between the 
styles of separate individuals, and also between the 
productions of one individual in manifestly dif- 
ferent psychological states, it was decided to 
examine in detail the contextual associates of spe- 
cific categories in the early, middle, and late inter- 
views, to determine whether the technique would 
show shifts in profiles associated with the specific 
categories. For this purpose, eight categories were 
selected for examination. The basis of selection of 
these categories was merely that they occurred 
with sufficient frequency in each of the three sets 
of interviews to make possible the development of 
separate profiles for each of the three sets. No 
a priori psychological grounds were used in the 
selection, so that no special significance is attribu- 
ted to the categories selected. The eight categories 
were: join, disagree, possess, go, hollow, agree, 
help, and reason. Contexts of these categories were 
defined as that stretch of speech from three lines 
above a line in which the specific category ap- 
peared, to three lines below the line in which the 
specific category appeared.* In each set of inter- 
views, as many such contexts of the specific cate- 
gories were taken as would provide enough scores 
for reliability check. The analysis then proceeded 
as above. There were two profiles of scores available 
for each category, for each set of interviews. The 
categories were ranked from high to low depending 
on distribution of frequencies of occurrence in the 
six profiles. The 16 highest categories and 44 lowest 
categories were eliminated before doing correla- 
tions. 

TABLE 4 

CORRELATIONS OF PROFILES OF CONTEXTUAL 
ASSOCIATES DERIVED FROM EARLY, 
MIDDLE, AND LATE INTERVIEWS 
OF PATIENT A 
(N = 34) 

                  Early            Middle          Late 
                E1      E2       M1      M2         L1 
Early     E2   .752 
Middle    M1   .323    .400 
          M2   .300    .380     .707 
Late      L1  -.033   -.078    -.044   -.189 
          L2   .156   -.012     .110   -.175      .704 

*The definition of a context as that stretch of 
speech which, in the typed transcript, begins three 
lines above the line containing a key word, and ends 
three lines after a line containing the key word, was 
developed in a prior study (Laffal, 1960), as a result 
of empirical exploration of stretches of speech of 
different lengths. 
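
The definition of a context as the stretch from three typed lines above to three lines below a line containing the key category lends itself to a short sketch. It assumes each transcript line is represented by a hypothetical list of the category labels scored on it, and it counts a line only once when two contexts overlap; the text does not say how overlaps were treated, so that choice is an assumption of the example.

```python
from collections import Counter
from typing import List


def context_profile(scored_lines: List[List[str]],
                    key: str,
                    radius: int = 3) -> Counter:
    """Tabulate the categories found in all contexts of `key`.

    A context runs from `radius` lines above to `radius` lines below
    each line on which `key` was scored, clipped at the transcript ends.
    """
    in_context = set()
    for i, labels in enumerate(scored_lines):
        if key in labels:
            lo = max(0, i - radius)
            hi = min(len(scored_lines), i + radius + 1)
            in_context.update(range(lo, hi))

    counts = Counter()
    for i in sorted(in_context):  # each line contributes once
        counts.update(scored_lines[i])
    return counts
```

The resulting counts play the role of a profile of contextual associates for one category, which can then be split, ranked, and correlated in the same way as the profiles for whole interviews.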

The distributions of correlations for the eight 
categories are not listed, since the pattern followed 
by the correlations basically repeats that of 
Table 4, with reliabilities being relatively high 
compared to other correlations, and the profiles of 
the last interviews being considerably different 
from those of the early and middle interviews. 

The question raised by the fact that the profiles 
of specific categories seemed to follow the general 
trend of the interviews on the whole, was whether 
contexts of specific categories merely reflected the 
general speech pattern, or whether there was some 
specificity about the profiles related to the specific 
categories. In order to examine this, the contexts 
of different categories from the same sets of inter- 
views were correlated. Thus contexts of join and 
disagree in the early, middle, and late interviews 
were correlated, as were contexts of possess and 
go, hollow and agree, and help and reason. In each 
case, the determination of which categories to 
eliminate was independent and followed the pro- 
cedure described. 

A Mann-Whitney rank test (Mosteller & Bush, 
1954) of the reliabilities of self-correlations of cate- 
gories versus the cross-correlations of different 
categories from the same sets of interviews, shows 
that the reliabilities of self-correlations were sig- 
nificantly higher than the cross-correlations, at the 
.05 level of confidence. The mean of self-correla- 
tions was .521 with a range of .220-.832. The mean 
of the cross-correlations was .372 with a range of 
.147-.557. 
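
For readers unfamiliar with the rank test, the statistic it rests on can be computed directly from the two sets of correlations. The sketch below is a generic Mann-Whitney U computation, not a reproduction of the analysis reported here: the individual self-correlations and cross-correlations are not listed in the text, so the values in the example are hypothetical, and the significance level would still have to be read from published tables of U.

```python
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U_x, U_y).

    U_x counts the pairs in which a value from x exceeds a value
    from y, with tied pairs counted as one half.
    """
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Hypothetical values for illustration only.
self_correlations = [0.80, 0.60, 0.50]
cross_correlations = [0.40, 0.55, 0.30]
print(mann_whitney_u(self_correlations, cross_correlations))
```

A value of U_x close to the product of the two sample sizes indicates that the self-correlations tend to exceed the cross-correlations.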

This demonstrates that there is a specificity in 
the contexts of individual categories which can be 
distinguished from the general drift of the body of 
verbal material out of which the categories are 
drawn. In the present study no further effort was 
made to explore shifts in the contexts of individual 
categories, since no evidence is as yet available as 
to the psychological significance of the individual 
categories. 

The data were examined in one other way, in 
order to throw some light on the nature of the 
change in the patient’s speech in the course of 
psychotherapy. This further examination was also 
the result of an effort to determine whether the 
differences between Schreber’s writing and the 
speech of the two patients, might not be paralleled 
by differences in relative diversity or constriction 
in the use of the 94 categories. No specific hy- 
potheses were involved in this examination al- 
though it was believed (erroneously, as it turned 
out) that wider use of the 94 categories and more 
uniform distribution of the responses among the 
categories ought to be concomitants of better psy- 
chological integration. 

The diversity of categories used was examined 
by computing entropy or average information 
(Shannon & Weaver, 1949) of the 94 categories 
used by Patients A and B, and by Schreber, and in 
the early, middle, and late interviews of Patient A. 
The formula for the computation of entropy is 
$H = -\sum_i p_i \log_2 p_i$. In the profile of categories, $p_i$ 
is the frequency of occurrence of a particular cate- 
gory, divided by the total number of occurrences 
of scorable categories. 
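
The entropy computation can be sketched as follows. The example assumes logarithms to base 2, which is the reading consistent with the reported scores: with 94 categories the largest possible value in bits is log2 94, about 6.55, whereas natural logarithms could not yield values above about 4.54. The function name is hypothetical.

```python
from math import log2
from typing import Sequence


def entropy_bits(frequencies: Sequence[int]) -> float:
    """Average information (in bits) of a profile of category frequencies.

    Categories with zero frequency contribute nothing to the sum.
    """
    total = sum(frequencies)
    return -sum((f / total) * log2(f / total) for f in frequencies if f > 0)


# A perfectly uniform use of all 94 categories gives the maximum score.
print(entropy_bits([1] * 94))
```

A profile concentrated in a few categories gives a low score, and a profile spread evenly over many categories gives a high one, so the score serves as an index of diversity in the use of the categories.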

The entropy scores* in Tables 5 and 6 may shed 
some light on the nature of Patient A’s changes 
during psychotherapy. In Table 6, Patient A’s 
entropy scores, based on samples drawn from all 
of his interviews, are higher than the entropy scores 
of Patient B, and of Schreber’s writing. It is not 
clear why the percentage of words categorized in 
these samples is lower than those categorized 
throughout the interviews, as shown in Table 5. 
This may be an artifact of the particular segments 
of the interviews that were selected for categoriza- 
tion. The average entropy score of all the inter- 
views of Patient A in Table 5 is 5.6112. This is 
slightly lower than the entropy scores shown for 
Patient A in Table 6, but is still higher than the 
entropy scores for Patient B and for Schreber. 

If the higher entropy score for Patient A in 
Table 6 is a valid indicator of broader and more 
dispersed use of categories by Patient A as com- 
pared to Patient B and to Schreber, this must 
reflect the fact that Patient A throughout the in- 
terview samples selected was considerably more 
disorganized than either Patient B or Schreber in 


*The negative sign has been dropped in these
scores.


TABLE 5

INFORMATION SCORES OF EARLY, MIDDLE,
AND LATE INTERVIEWS OF PATIENT A

Interviews    Information scores of    Percentage of total
              category profiles        words categorized*

Early 1       5.7758                   .420
Early 2       5.6257                   .401
Early 3       5.5800                   .389
Early 4       5.6188                   .406
Middle 1      5.6874                   .356
Middle 2      5.7546                   .375
Middle 3      5.6767                   .388
Middle 4      5.6642                   .403
Late 1        5.5762                   .390
Late 2        5.5742                   .304
Late 3        5.3250                   .375
Late 4        5.7087                   .373
Late 5        5.3778                   .380

* Percentage of total words categorized was obtained by dividing
the number of categorizations in an interview by an estimate
of the total number of words in the interview. Total number of
words in the interview was estimated by multiplying the number
of typed lines in the transcript by 12, the average number of words
in a line.


TABLE 6

INFORMATION SCORES OF A RANDOMLY SELECTED
BODY OF SPEECH FROM INTERVIEWS OF
PATIENT A, OF PATIENT B, AND
FROM THE WRITING OF SCHREBER

Source of                 Information scores of    Percentage of total
material                  category profiles        words categorized*

Patient A    Sample 1     5.7105                   .216
             Sample 2     5.7194
Patient B    Sample 1     5.4925                   .307
             Sample 2     5.5262
Schreber     Sample 1     5.5346                   .272
             Sample 2     5.4679

* Percentage of total words categorized was obtained by dividing
the total number of categorizations in Samples 1 and 2 of each
patient by an estimate of the total number of words used in the
samples. Total number of words in the sample was estimated by
multiplying the number of typed lines in the transcripts by 12,
the average number of words in a line.


his writing. Patient A in his early and middle 
interviews especially was speaking in a typically 
schizophrenically obscure fashion. Patient B, while 
paranoid in his ideation, spoke with intact language;
and Schreber, of course, while wildly delusional,
presented his ideas in a well organized and
logical fashion. The higher entropy scores would 
thus reflect less organization and structure in the 
patient’s use of categorizable words. 

That high entropy scores derived from the profiles
of categories do indeed mirror greater disorganization
is somewhat supported by close examination
of Table 5. The Mann-Whitney test, applied to the
ranked entropy scores, shows the entropy scores
of the late interviews to be significantly smaller at
the .05 level than the entropy scores of the early
and middle interviews. In view of the available
evidence that the last interviews occurred during
a period of relative integration in Patient A, as
compared to the early and middle interviews, a
reasonable conclusion appears to be that psychological
integration is accompanied by greater
structuring of the category profile, or by reduction
of the diversity and dispersion of category choices
in speech.
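The comparison above can be retraced from the Table 5 entropy scores with a short Python sketch. It is illustrative only: the function names are hypothetical, and the exact one-sided permutation p value is an assumption about how the test might be evaluated, since the text does not say whether a one- or two-tailed criterion was used.

```python
from itertools import combinations

# Entropy scores from Table 5.
early_middle = [5.7758, 5.6257, 5.5800, 5.6188,
                5.6874, 5.7546, 5.6767, 5.6642]
late = [5.5762, 5.5742, 5.3250, 5.7087, 5.3778]


def mann_whitney_u(sample, other):
    """U for `sample`: count of pairs where a sample value exceeds an
    `other` value (ties count one half)."""
    u = 0.0
    for x in sample:
        for y in other:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def exact_lower_tail_p(sample, other):
    """P(U <= observed U) under the null hypothesis, obtained by
    enumerating every way of relabeling the pooled scores."""
    pooled = list(sample) + list(other)
    observed = mann_whitney_u(sample, other)
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(sample)):
        chosen = set(idx)
        s = [pooled[i] for i in chosen]
        o = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if mann_whitney_u(s, o) <= observed:
            hits += 1
    return hits / total


u_late = mann_whitney_u(late, early_middle)
p_one_sided = exact_lower_tail_p(late, early_middle)
print(u_late, round(p_one_sided, 4))
```

A small U for the late interviews means that few late scores exceed early or middle scores, which is the direction of difference reported in the text.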

This conclusion suggests, if the present patient 
is typical, that movement in the direction of re- 
covery from schizophrenia is attended by a kind 
of constriction of the language process. Actually 
this fits some of our intuitive notions of recovered 
schizophrenic patients. Very often such patients 
who impress one as imaginative in their psychotic 
states turn out to be rather limited individuals in 
their remissions. Recovery, in a figurative sense,
is at the expense of broadness; the recovered pa- 
tient no longer has available a variety of possible 
verbal associations which were simple for him in 
his state of illness. 

Thus in applying the technique of analysis of 
contextual associates to interview material from 
the beginning, middle, and latter part of a year of 
treatment of a schizophrenic patient, the essential 
findings are that the patient’s language changes as 
his social behavior changes, and that the nature of 
this language change is in the direction of constric- 
tion and organization. It has also been demon- 
strated that it may be possible to examine in detail 
how the patient’s speech about specific categories, 
and by implication, specific subject matters, 
changes over a period of time. 

The technique of analysis of contextual associ- 
ates is a complex and laborious one. However, the 
material which it undertakes to deal with, verbal 
behavior, is easily among the most complicated
behaviors that psychologists have attempted to
study. Others who find the technique of interest, 
may wish to introduce their own variations into it, 
and there is little doubt that further research, in 
revealing the weaknesses and inadequacies of the 
method, will also point toward ways of stream- 
lining it and reducing its cumbersomeness. 


SUMMARY 


A technique of language analysis, called the 
analysis of contextual associates, is applied to 
transcripts of therapeutic interviews taken from a 
year of treatment of a schizophrenic patient. The 
analysis shows changes in the language of the pa- 
tient from a more psychotic period to a more 
socially integrated period. An examination of the 
average information, or entropy, of the patient’s 
speech in the separate stages of his social reintegra- 
tion reveals that the more socially integrated 
period is accompanied by a reduction in informa- 
tion in his speech. The study highlights some of 
the potential of the method of analysis of contextual
associates.


CRITIQUE AND NOTES


SHIFTS IN EVALUATIONS OF PARTICIPANTS FOLLOWING INTERGROUP
COMPETITION¹


WARNER WILSON² AND NORMAN MILLER³


Northwestern University 


A number of hypotheses concerning the effects
of interaction and competition between individuals
and groups can be found in recent litera-
ture. Interaction between persons tends to increase 
their liking for one another (Bovard, 1951, p. 404; 
Festinger, 1951; Homans, 1950, p. 112; Sherif & 
Sherif, 1956, p. 293); the tendency for interaction 
to result in greater liking tends to be nullified or 
reversed if the interaction is accompanied by un- 
pleasant experiences such as having to associate 
with persons perceived as social inferiors (Festinger 
& Kelley, 1951), failing (Coch & French, 1948; 
French, 1944; Sherif & Sherif, 1956, p. 294), or 
engaging in intragroup competition (Deutsch, 
1953; Grossack, 1954); intergroup competition 
leads to increased ingroup cohesion (Sherif & 
Sherif, 1956, p. 306); intergroup competition leads 
to an unfavorable perception of the outgroup 
(Avigdor, 1952; Sherif & Sherif, 1956; Shrilke, 
1936); and a hostile perception of the outgroup is 
an inevitable concomitant of ingroup cohesion 
(Murdock, 1949, p. 83). While these general theo- 
retical statements are no doubt true of some groups 
under some circumstances, it seems likely that as 
more information is gathered, additions and quali- 
fications will be called for. 

The present paper represents one step toward 
specification of the conditions under which these 
statements apply. Intergroup competition was 
simulated by having two-man teams compete 
against two “stooges” on several tasks. Possible 
differences in competence were eliminated by 
manipulating winning and losing experimentally 
while using equivalent groups of subjects. Winning 
or losing was manipulated by instructing trained 
stooges to consistently win on all tasks against half 
the teams and to consistently lose against the 
other half. The subjects rated their own team 
member as well as the members of the opposing 
team (the stooges), on a list of traits. Ratings were 


¹ This research was supported in part by United
States Public Health Project No. M-1544 under the 
direction of Donald T. Campbell. The authors are 
grateful for his encouragement and aid. 

The authors wish to appreciatively acknowledge 
suggestions made by Harold Guetzkow during the 
planning stages of this study. 

² Now at the University of Hawaii.

³ Now at Yale University.

made both before and after intergroup competition.
The shift in favorability of a team member’s before
and after ratings of his teammate and his opponents
was the dependent variable.


METHOD 


Subjects. The 60 subjects were male students at 
Northwestern University. Half were recruited from 
fraternity houses; the other subjects were students in 
introductory psychology. The subjects drawn from 
introductory psychology received the equivalent of a 
correct answer on their final exam by serving in an
experiment. Each two-man team of subjects was
randomly assigned to a Win or Lose condition with 
the restriction that there be 15 teams in each condi- 
tion. 

Stooges. The stooges used to manipulate winning and 
losing were students in advanced undergraduate psy- 
chology classes. They were paid for their time. Persons 
unmistakably mature in physical appearance were not
selected. In all, three different pairs of stooges were 
used. 

Procedure. Four persons at a time took part in the 
experiment, two of whom were always stooges. When 
all four had arrived, they were ushered into the experimental
room, shown where to hang their coats, and
seated at a round table 5 feet in diameter. A number
was placed before each subject to identify him. The 
subjects were then given a rating form with 27 per- 
sonality traits: anxious, bossy, capable, complaining, 
conceited, critical, dependent, masculine, greedy, 
gullible, hostile, influential, insincere, obsequious, 
orderly, prying, scheming, secretive, intelligent, stingy, 
strict, stubborn, suspicious, touchy, unappreciative, 
unscrupulous, good judge of personality. Subjects were 
instructed on how to use a nine-point rating scale to 
rate themselves and the other three persons on each of 
the 27 traits. The order in which persons were rated 
was randomly determined. All subjects were provided 
with a definition sheet containing synonyms and short 
behavioral descriptions of all the traits. This definition 
sheet also differentiated between those traits most 
commonly confused by college subjects in a pilot study.
Following the rating task, subjects were instructed as 
follows: 
Now I want to divide you into two teams. Persons
____ and ____ will form one team, and persons ____ and
____ the other. I have several sets of problems which
two persons can work on at once. Team members
must cooperate for efficient solutions. Both teams
will work on the same problem at the same time.
The team solving the problem first wins. The speed
with which problems of this type are solved correlates
highly with intelligence test performance;
however, the problems are fairly simple for college
persons and you should not find them difficult.
They are representative of a variety of skills and
we typically find that each team wins about half
the time. Each time a team wins, that team will
receive $1.00. As indicated, the problems require
the participation of both members of each team.
Therefore, feel free to converse as much as necessary
with your teammate in coordinating your efforts
at solution.

While subjects were apparently assigned to teams 
at random, the two stooges were uniformly assigned 
to the same team. The first task employed a paste- 
board box approximately 18 inches in each dimension 
and webbed with strings producing a complex criss- 
cross pattern. A small cigarette box was placed in the 
bottom of the larger box. Each subject was given a 20 
inch length of stiff straight wire, }¢ inch in diameter, 
and instructed to help his partner lift the cigarette box 
through the strings and out of the larger box without 
touching it with hands or body. The second task re- 
quired the teammates to arrange four flat pieces of 
plastic into a rectangle. The third task was one of the
so-called “water jar” problems, which is essentially a
simple mathematical puzzle. The fourth task was a
crossword puzzle. 

The total period of interaction on the four tasks 
lasted 13 minutes. When stooges had been instructed 
to win, they guided their solution speed against the 
progress of the crucial subjects thereby effectively 
minimizing any differences between the amount of 
time spent by Winning and Losing subjects on each 
of the four tasks. Differences in total solution time 
were controlled by manipulating the time allowed for 
the last task, the crossword puzzle, so as to bring the 
total for every group to 13 minutes. 

Following competition, the experimenter read these 
additional instructions: 

One of the purposes of this experiment is to determine
whether your ability to judge personality
improves after you have had some experience on
simple tasks with the persons you are rating. Now
you are to again rate the other three persons and
yourselves on the 27 traits. This time you should
be able to make your ratings more accurate than
they were last time.⁴

Treatment of the data. Since all 27 traits had an
evaluative connotation, the differences between before
and after ratings on all traits could be combined to
obtain a composite difference score reflecting an algebraic
sum of the shifts in the favorability of a subject’s
ratings. The traits capable, masculine, influential,
orderly, intelligent, and good judge of personality
were assumed to be positive, all others to be
negative. The ratings of the positive traits were reversed
(a rating of one was changed to nine, two to
eight, etc.) to give all upward shifts in ratings a common
meaning.


⁴ All subjects filled out an open-ended questionnaire
and forced-choice questionnaire on reasons why they
might have done better on the tasks than they actually
did. These data, as well as the self-rating data, are
omitted from the present report. Similarly, all ratings
by the stooges are ignored.

⁵ An analysis was done on each trait separately to
determine the percentage of judgments which increased
in favorability after interaction. Most shifts were in a

The individual composite difference scores of team 
members were averaged so that each time a team of 
subjects was tested, a single difference score for their 
team and a single difference score for the opponents 
would be obtained. 
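The scoring steps described above (reversing the positive traits, summing the before-after shifts, and averaging over team members) can be sketched as follows. This is a minimal illustration, not the authors' scoring program: the function names and example ratings are hypothetical, and the sign convention (a positive score means a shift toward a more favorable rating) is an assumption made so that the result reads the same way as Table 1.

```python
POSITIVE_TRAITS = {
    "capable", "masculine", "influential", "orderly",
    "intelligent", "good judge of personality",
}


def reverse(rating):
    """Reverse a nine-point rating: 1 becomes 9, 2 becomes 8, and so on."""
    return 10 - rating


def composite_shift(before, after):
    """Algebraic sum of favorability shifts over all rated traits.

    `before` and `after` map trait name -> rating on the 1-9 scale.
    Assumption: once positive traits are reversed, a higher number is
    less favorable on every trait, so (before - after) is positive
    when the rating has become more favorable.
    """
    total = 0
    for trait, b in before.items():
        a = after[trait]
        if trait in POSITIVE_TRAITS:
            b, a = reverse(b), reverse(a)
        total += b - a
    return total


def team_score(member_scores):
    """Average the composite scores of the team's members."""
    return sum(member_scores) / len(member_scores)


# Hypothetical ratings by one subject of one target on two traits.
before = {"capable": 5, "hostile": 5}
after = {"capable": 7, "hostile": 3}
print(composite_shift(before, after))
```

Averaging the two members' composite scores with `team_score` yields the single score per team and rated object that enters the analyses.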

The fact that the stooges and crucial subjects were 
recruited from different populations bears considera- 
tion. Shifts in ratings of ingroup and outgroup members 
may not be comparable given initial differences in 
these rated objects. Averaging the initial ratings so as 
to yield one score for their Ingroup and one for the 
Outgroup by each team of subjects, the differences 
between initial Ingroup and Outgroup ratings made by 
Winning and Losing teams yielded t ratios of .54 and
.38. However, the possibility still remains that in spite 
of initial random assignment to conditions, those sub- 
jects assigned to one subgroup might make initial 
judgments consistently higher or lower than those 
assigned to another. Subjects were therefore divided 
into four groups according to the source from which 
they were recruited and the condition to which they 
were assigned, Win or Lose, and the initial ratings of 
Ingroup members and Outgroup members were exam- 
ined by separate analyses of variance. The two analyses 
of variance yielded F ratios of 1.41 or less, p > .05. 
Thus, groups recruited from different sources, Winners 
and Losers, and ratings of Ingroup and Outgroup mem- 
bers were considered equivalent in terms of initial 
judgments. 


RESULTS 


Table 1 shows the mean magnitude of the four 
composite difference scores by subgroup and indi- 
cates the significance of the change. In all four 
instances, regardless of the team’s experimental 
treatment or the object rated by the team, there 
is a significant shift toward a more favorable evalu- 
ation following interaction. 

Table 1 also suggests that some before-after 
shifts are larger than others. The results of a 
2 X 2 trend analysis of the composite difference 
scores of teams are presented in Table 2. Since the 
order in which teammates and opponents were 
rated was randomly varied, the trend analysis does 





positive direction. While some traits showed more 
change than others, the percentage of shifts on most 
of the traits is insignificant. The only significant shifts 
in the negative direction occurred when the winners 
rated the losers on the following traits: capable, de- 
pendent, masculine, greedy, and intelligent. While the 
traits “capable” and “intelligent” seem to be task 
relevant traits, there does not seem to be any adequate 
basis for interpreting all five of these traits as defining 
a separate factor. The major analyses are based instead 
on composite scores which lump all the traits together. 





TABLE 1

MEAN SHIFTS IN FAVORABILITY

Experimental                        Mean
condition(a)     Objects rated      shift          t       p

Sw Tw Ol         Teammates          13.73          3.94    <.01
                 Opponents           5.35          3.33    <.01
Sl Tl Ow         Teammates          [illegible]    2.85    <[illegible]
                 Opponents           7.62          3.56    <.01

(a) S = subject, T = teammate, O = opponent, w = won, l = lost.

TABLE 2

ANALYSIS OF VARIANCE OF SHIFTS IN
FAVORABILITY

Source                                df     MS        F

Between treatments (W-L)               1     234.03    2.33
Between subjects(a) in same group     28     100.40
Total between subjects                29
Between trials (T-O)                   1      70.41    1.37
Interaction: (W-L) X (T-O)             1     579.71    8.87*
Pooled subjects X Trials              28      65.36
Total within subjects                 30

(a) “Subjects” actually refers to “teams”; each team’s data for
T and O was pooled to create a single score for each.
* p < .01


not evaluate a trial or order effect in the objects 
rated. It merely handles the fact that the ratings 
of teammate and opponent are not independent. 
However, as seen in Table 2, neither main effect is 
significant; only the interaction between the ex- 
perimental treatment (Win or Lose) and object 
rated (Teammates or Opponents) is significant 
(F = 8.87, p < .01).⁶ As seen by inspection of
Table 1, the significant interaction reflects the 
opposite effects of the experimental treatment on 
magnitude of shift in the ratings. When the crucial 
team won and other team lost, there were large 
positive shifts in ratings of teammates but small 
positive shifts in ratings of opponents. However, 
when the crucial team lost and the other team won, 
ratings of teammates showed small positive shifts 
while ratings of opponents showed somewhat 
larger positive shifts. While there is a suggestion
that the experimental treatment had a greater
effect on ratings of teammates as opposed to opponents,
this difference does not reach significance
(t = 1.68, p > .10).


⁶ Since Bartlett’s test indicated the composite
difference scores did not meet the homogeneity of
variance assumption ($\chi^2 = 11.37$, $.01 < p < .02$), the
trend analysis was repeated on transformed scores.
The transformation $X' = \sqrt{X} + \sqrt{X + 1}$ yielded
homogeneous variances ($\chi^2 = 1.20$, $p > .25$). The
trend analysis on the transformed scores paralleled
the analysis of Table 2, yielding a significant interaction
but no main effects.


DISCUSSION 


Though the design confounds the subject’s experience
with that of his teammate and his opponents,
it mirrors dyadic competition as it is typically
found in the natural environment. That one’s
own team wins when the opponents lose, and that
whenever one wins one’s teammate does too, are
the rules rather than exceptions of competitive
activity. Within these limits, the results suggest
that the competence of the object rated (Teammate
or Opposing team), as reflected by the performance
of that team on the experimental tasks,
is the major factor accounting for the differential
increase in the favorability ratings made by the
two experimental groups.

In addition, the results speak on the several 
hypotheses initially considered. It is quite apparent 
from Table 1 that the first hypothesis, predicting 
an increase in liking as a function of interaction, 
receives support, provided one is willing to define 
an “increase in liking” by the dependent measure
herein employed. Hypothesis 2, which is essentially 
a qualification of the first hypothesis, states that 
the tendency for interaction to result in greater 
liking tends to be nullified if the interaction is 
accompanied by unpleasant experiences such as 
losing. While this hypothesis would receive support 
if only the ratings of teammates were inspected, 
there is no main effect in the direction of lower 
ratings by the subjects who lost. However, it may 
reasonably be claimed that the interaction was not 
sufficiently long nor the unpleasantness of losing 
sufficiently intense to warrant an expectation of 
such results in the present instance. 

If the relative increase in the favorability of
ratings of teammate versus opposing team is used 
as an index of increased group cohesion, the third 
hypothesis is also in need of qualification. Ratings 
of teammates did not increase more than ratings 
of opponents. The related hypothesis that inter- 
group competition leads to an unfavorable percep- 
tion of the outgroup (Opponents) also fails to find 
support in the present instance. Similarly, pooling 
all subjects, the correlation of .17 between the 
increase in favorability of teammate and opponent 
ratings offers no support for the notion that in- 
group cohesion presupposes a negative perception 
of the outgroup; indeed, the correlation is in the 
wrong direction. 


SUMMARY 


Intergroup competition was simulated by having
two-man teams compete against two stooges on
several tasks. Winning and losing were manipulated
by having the stooges always win against half the
teams and always lose against the other half. The
difference in favorability of a team’s before and
after ratings of the other participants on 27 personality
traits was the dependent variable.

A significant interaction was found between the 
experimental treatment (Win or Lose) and the 
object rated (Teammate or Opponent). The rele- 
vance of the findings to five widely held hypotheses 
concerning the effects of interaction and competi- 
tion on individuals and groups was considered. 
Only the hypothesis that “liking” is increased by 
interaction received unqualified support.


THE EFFECTS OF THORAZINE ON LEARNING AND RETENTION IN
SCHIZOPHRENIC PATIENTS¹


NORRIS D. 


University of 


Despite the enormously expanding use of
thorazine and related drugs, little effort has
been directed towards investigating the effects of 
such agents on basic psychological processes like 
learning, retention, judgment, and abstract reason- 
ing. The claim is often made that these functions 
remain intact or are improved in psychiatric pa- 
tients as a result of thorazine treatment (Lehmann 
& Hanrahan, 1954). These claims have, as a rule, 
been based on clinical impressions and have led 
those making the statement to the further claim 
that the drugs facilitate the process of psycho- 
therapy. If the psychiatric literature can be taken 
as an indication, it seems that one of the major 
arguments for the use of the so-called “tranquilizers”
is that they facilitate psychotherapy (Ford
& Jameson, 1955; Kinross-Wright, 1955; Winkelman,
1954). Lesse (1956) has expressed concern
regarding the paucity of specific research in this 
area. As Eysenck (1957) suggested, our relative 
lack of knowledge in this area may, perhaps, be due 
to the weak rationales for all of the somatic treat- 
ment procedures. Two recent efforts have been 
made, however, to provide a theoretical basis to 
account for the behavioral changes resulting from 
these procedures (Eysenck, 1957; Heistad, 1957). 

A number of investigators have been concerned
with evaluating the effects of thorazine on intelli- 
gence and psychomotor efficiency. Gilgash (1957), 
Kovitz, Carter, and Addison (1955), and Petrie 
and LeBeau (1956) reported improved Wechsler- 
Bellevue test performance in psychotic patients 
after a period of thorazine treatment in dosages 
from 75 to 400 mg. per day. Gibbs, Wilkens, and 
Lauterbach (1957), using two drug dosage levels 
(75-125 and 150-450 mg. per day), found their 
placebo group to demonstrate greater pre- to post- 
test gains in Wechsler-Bellevue IQ than the two 
drug groups. Tourlentes, Hunsicker, and Hurd 
(1958) found thorazine (200 mg. per day) to have 
no significant effect on verbal or nonverbal intelligence,
verbal fluency, suggestibility, motor speed,
or perceptual recognition in schizophrenic patients. 


¹ This report is based on a dissertation submitted
to the Graduate School, University of Minnesota, in 
partial fulfillment of the requirements for the degree of 
Doctor of Philosophy. 

The author is grateful to his advisor, William 
Schofield, for his guidance and encouragement. 

² Now at Veterans Administration Hospital, St.
Cloud, Minnesota.

Shatin, Rockmore, and Funk (1956) reported that
thorazine did not interfere with, and in some cases 
increased, performance on psychomotor tests 
(tracing, dotting, tapping) and on tests of rote 
memory, digit symbol substitution, and word 
fluency. Whitehead and Thune (1958) found thora- 
zine in doses of 200 or 300 mg. to have little effect 
on several functions from simple motor learning to 
problem solving. Porteus (1957) and Porteus and 
Barclay (1957) reported significant decrements in 
maze performance, which they attributed to the 
drug. 

Most of these studies have been concerned with
the effects of thorazine on intelligence test scores
or psychomotor and psychological efficiency tests.
In the present investigation, the selection of the
task was dictated by the intention to find a procedure
that might be somewhat more pertinent to
the question of the effect of thorazine on the learning
or relearning which occurs during psychotherapy.

This study investigated the effects of one of the
tranquilizing drugs (thorazine) on attempts to
alter, by selective reinforcement, preferred responses
to stimulus words in schizophrenic
patients.

METHOD 
Subjects 

Forty schizophrenics, most of whom were hospitalized
less than 6 months, were less than 45 years of
age, and were average or above in intelligence, were
chosen for the study. Nine subjects were dropped for
one of a number of reasons and were replaced by other
subjects who met these criteria. The mean age of this
group was 35.8 years; the mean length of hospitalization
was 4.3 months, and the mean Shipley-Hartford
IQ was 106.4.


Procedure 


The experimental procedures involved the administration
of a free association test, a specially
designed learning task, and a test for retention of
learning.

Free association situation. Thirty-six words arbitrarily
chosen from the Kent-Rosanoff list were administered
in the usual manner. The subject’s spontaneous
association to each stimulus word was recorded.

Learning situation. The learning materials, which
were a multiple-choice form of the association technique,
were presented on 4” X 6” index cards. The
stimulus word was typed in capital letters near the
top of the card and three response alternatives were
typed in small letters on a single line below the stimulus
word. The stimulus words were the same 36 which
were used in the free association situation. The response
alternatives for each stimulus word included the subject’s
preferred (free association) response, which, of
course, varied for each subject, and two low frequency
responses ([illegible] 1%) selected from the Russell-Jenkins
(1954) tables. The subject’s preferred response appeared
with equal frequency in each of the three positions
(left, middle, and right) on the cards.

During the learning trials the experimenter said,
“Right,” following responses the subject was to
acquire; he said, “Wrong,” for any other choice. The
response selected by the experimenter as the “correct
response” was randomized for each stimulus word and
for each subject and was also controlled for position.
For 12 of the 36 items, the subject’s preferred response
was called “correct”; for the remaining 24 items, one
of the two low frequency alternatives was called
“correct.” After each trial, the experimenter shuffled
the deck of cards so that the order of presentation was
changed for each trial. Once a card was responded to
correctly on three successive trials, it was removed
from the deck. Learning trials continued until 18 out
of 24 “new” associations were learned to a criterion of
three successive correct responses each.

Retention test. Retention was tested by presenting
the learning materials to the subject and having him
try to recall the correct responses.
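The bookkeeping of the learning trials described above (shuffle the deck on each trial, drop a card after three successive correct responses, stop once 18 of the 24 “new” associations have reached criterion) can be sketched in Python. This is only an illustration under assumptions: the function and parameter names are hypothetical, the responder is a stand-in for a subject, and the sketch finishes the current pass through the deck before checking the stopping rule, a detail the text does not specify.

```python
import random


def run_learning_trials(correct, new_items, respond,
                        criterion=3, target=18, max_trials=1000, seed=0):
    """Simulate the card procedure.

    correct   : dict mapping stimulus word -> response called "Right"
    new_items : set of stimuli whose correct response is a "new" association
    respond   : function (stimulus, trial_number) -> chosen response
    Returns (number of trials run, set of new items learned to criterion).
    """
    rng = random.Random(seed)
    deck = list(correct)
    streak = {stimulus: 0 for stimulus in deck}
    learned_new = set()
    trial = 0
    while len(learned_new) < target and deck and trial < max_trials:
        trial += 1
        rng.shuffle(deck)  # order of presentation changes on every trial
        for stimulus in list(deck):
            if respond(stimulus, trial) == correct[stimulus]:
                streak[stimulus] += 1
            else:
                streak[stimulus] = 0  # successive run is broken
            if streak[stimulus] >= criterion:
                deck.remove(stimulus)  # card leaves the deck
                if stimulus in new_items:
                    learned_new.add(stimulus)
    return trial, learned_new


# Hypothetical material: 36 items, the first 24 requiring a new association.
correct = {f"word{i}": f"resp{i}" for i in range(36)}
new_items = {f"word{i}" for i in range(24)}

# Stand-in subject who always answers correctly.
trials, learned = run_learning_trials(
    correct, new_items, lambda stimulus, trial: correct[stimulus])
print(trials, len(learned))
```

The number of trials returned corresponds to the trials-to-criterion measure used in the Results to compare the thorazine and placebo learning groups.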


Treatment of Experimental Groups 


The entire program was 4 weeks in length for each 
subject. All subjects were placed on placebos for 1 week 
prior to the free association test in order to diminish 
the effects of previous drugs. Following the association 
test, subjects were randomly assigned to one of four 
groups. In the following description of these groups, 
the first letter refers to the conditions (thorazine or 
placebo) during the 1-week period between the asso- 
ciation test and the learning trials; the second refers 
to the conditions during the 2-week period between 
the learning trials and the retention test. The four 
groups were as follows: 

TT Group. These subjects learned and were tested
for retention under thorazine conditions.

TP Group. These subjects learned under thorazine, 
but they were tested for retention under placebo con- 
ditions. 

PP Group. These subjects learned and were tested 
for retention under placebo conditions. 

PT Group. These subjects learned under placebo 
conditions, but they were tested for retention under 
thorazine conditions. 

At the time of the learning trials and the retention
test, 20% of the subjects were on 100 mg. per day of
thorazine, 17% were on 200 mg., 27% were on 300 mg.,
33% were on 400 mg., and 3% were on 800 mg. The
placebos were identical in appearance to thorazine
tablets. Although the ward staff was aware of the
drug each patient was receiving, the experimenter was
not.


TABLE 1

MEAN NUMBER OF ITEMS RECALLED FOR
THORAZINE AND PLACEBO LEARNING
GROUPS

                              PP and PT (N = 20)    TT and TP (N = 20)
Items used                    M        Variance     M        Variance     t       p

All 24 “New” Associations     15.70    14.85        11.25    12.30        3.82    .001
18 Associations learned       13.15     9.19         9.05     7.21        4.53    .001



RESULTS 

The effects of thorazine on learning were evalu- 
ated by comparing the performance of the two 
groups having their learning trials under thorazine 
treatment (TT and TP groups) with the perform- 
ance of the two groups having their learning trials 
under placebo conditions (PP and PT groups). Thorazine,
in the dosages used, was found to have no
significant effect on attempts to alter preferred responses
to stimulus words. Although the placebo
learning groups required fewer trials to reach criterion,
the t value obtained in testing the between-group
difference did not reach an acceptable level
of significance (p = .18).

The data for the retention phase were examined 
in several ways. Recall differences between the PP 
and PT groups and between the TT and TP groups 
were studied. Also, the recall of the combined TT 
and PP groups was compared with that of the 
combined TP and PT groups. All t tests of the
relevant between-group differences were within 
chance expectancies. 

A comparison was made of the effect of thora- 
zine on retention regardless of the learning condi- 
tion, that is, groups TT and PT vs. PP and TP. 
The administration of thorazine during the period 
from the learning trials to the retention test did not 
interfere with the recall of the previously learned 
material. 

When the data were analyzed in such a way 
that the two groups learning under thorazine (TT 
and TP) and the two groups learning under 
placebos (PP and PT) were compared with respect 
to retention, some important findings were revealed. 
As indicated in Table 1, the two groups having 
learning trials under placebos recalled, at the time 
of the retention test, significantly more items than 
the thorazine learning groups. As reported in the 
results of the learning trials, the thorazine learn- 
ing groups took slightly, but not significantly, 
more trials to reach the learning criterion. Even 
though this difference was not significant, it is 
possible that the observed differences in retention 
partially reflect these slight differences in learning.
For this reason, it was thought that the findings
would be more interpretable if it could be demon- 
strated that subjects, matched on the number of 
trials required to learn to criterion, differed with 
respect to the number of items recalled on the re- 
tention test. Of the 20 subjects in each learning 
group (thorazine or placebo), 13 could be matched 
on the number of trials required to reach criterion. 
These 26 matched subjects, 13 from each learning 
condition, were then compared with respect to re- 
call of the “new” associations learned. This opera- 
tion served to reduce only slightly the significance 
of the differences on retention. The two groups 
having learning trials under placebos still recalled 
significantly (.01) more of the 24 “new” associa- 
tions presented and also significantly (.001) more 
of the 18 (of the 24) “new” associations learned to 
criterion. 
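
The matching step can be sketched as follows. The report says only that 13 subjects from each learning condition could be matched on trials to criterion; the one-to-one exact matching rule, the function name, and the sample records below are assumptions made for illustration.

```python
from collections import defaultdict


def match_on_trials(drug_subjects, placebo_subjects):
    """Pair subjects across conditions with identical trials to criterion.

    Each argument is a list of (subject_id, trials_to_criterion) tuples.
    A placebo subject is used at most once; unmatched subjects are dropped.
    """
    available = defaultdict(list)
    for subject_id, trials in placebo_subjects:
        available[trials].append(subject_id)

    pairs = []
    for subject_id, trials in drug_subjects:
        if available[trials]:
            pairs.append((subject_id, available[trials].pop(0)))
    return pairs


# Hypothetical records, not data from the study.
drug = [("d1", 5), ("d2", 7), ("d3", 7), ("d4", 9)]
placebo = [("p1", 7), ("p2", 5), ("p3", 4), ("p4", 6)]

pairs = match_on_trials(drug, placebo)
print(pairs)
```

Recall scores for the paired subjects can then be compared with learning speed held constant, which is the purpose the matching serves in the text.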

DISCUSSION 

The learning task required that the subject 
reject or unlearn a strong response, defined as his 
preferred response, to a stimulus word, learning a 
new response in its place. Thorazine was found to 
have rather complicated effects on learning and 
retention in this task. 

Inferior performance in learning might have 
been expected for the thorazine group if this drug 
has a general depressant effect on the learning 
process. The drug might, however, also bring about 
a reduction in negative transfer from previously 
learned associations which could counteract the 
depressant effect on learning. In this case, one 
might have expected the thorazine learning group 
to be superior on the learning of the 24 “new” associations
and perhaps inferior on the 12 “old” associations.
At least, one would expect the drug and
placebo learning groups to perform differently, 
relative to each other, on these two types of items. 
Such was not the case in the present study. 

The comparison of efficiency of recall under the 
same drug condition as prevailed during learning 
as opposed to recall under a different drug condi- 
tion was, in effect, a test, in the area of human 
verbal learning, of the theoretical propositions of 
Heistad (1957) about the effects of changes in the 
internal environment on response strength. The 
data did not support these hypotheses; the relevant 
between-group differences were statistically insig- 
nificant. 

This study found retention of learning to be 
impaired if the learning occurred under thorazine 
conditions. This finding is of interest because of the 
possible implications for psychiatric treatment pro- 
cedures. It has come to be a general practice to 
engage in psychotherapy with patients who are 
also being treated with chlorpromazine or some
similar chemical agent. If the learning (or relearning)
which is accomplished during a course of
psychotherapy by patients who are being treated
concurrently by drugs is less apt to be retained
than if these patients were not being treated concurrently
by chemical means, then one may certainly
wonder about the efficacy of this general
practice in producing any durable benefits from
psychological treatment.


SUMMARY 


The major concern of this study was to evaluate
the effects of thorazine on learning and retention.
There is no evidence that the drug, in the dosages
administered, has any significant effect on attempts
to alter the preferred responses to stimulus words
of schizophrenic patients. Retention was not enhanced
if the retention test was given under the
same drug condition as prevailed during learning.
Thorazine, given during the period from the learning
trials to the test for retention, did not interfere
with performance on retention. Retention was impaired,
however, if the learning was accomplished
under thorazine conditions. Stated in observational
terms, learning accomplished under drug
conditions was not retained as well as learning
accomplished under placebo conditions.

GROUP SIZE, PRIOR EXPERIENCE, AND CONFORMITY¹

The present study was a re-examination of
the relationship between group size and 
conformity. Asch (1952) had reported a curvi- 
linear relationship between group size and con- 
formity; a finding which neither Goldberg (1954) 
nor Kidd (1958) could replicate. These latter two 
studies, however, differed greatly from that of 
Asch. Kidd, unlike Asch, attempted to move his 
subjects’ judgments toward a more correct esti- 
mate. A similar problem existed in Goldberg's re- 
search where the experimental stimuli were much 
too ambiguous for the study to be comparable to 
Asch’s. Since the subjects were not sure of objec- 
tive reality, how could they have perceived any 
discrepancy between their original estimates and 
the supposed partner estimates? The experimental 
procedure employed by Asch, on the other hand, 
presented the subject with a clearly perceived con- 
flict. The influence source’s estimates (the judg- 
ments made by the subject’s supposed partners) 
were clearly in error. The present study was there- 
fore planned again to investigate the effect of the 
group size variable, using procedures, however, 
that are more definitely analogous to Asch’s pro- 
cedure. 

A second purpose of the present study was to 
investigate the effects on conformity of combining 
increasing group size with two prior experience 
variables. Prior experience refers here to a combina- 
tion of high partner prestige and failure experience. 
These two variables have each been found to be 
related to conformity, and have also been found to 
be additive in their effects (Mausner, 1953, 1954a, 
1954b; Mausner & Bloch, 1957). 


METHOD 


The subjects were 227 males enrolled in an introductory
psychology course. The subjects were required
to judge the lengths of a series of lines. The apparatus 
enabled the subject to choose one of 14 possible 
“lengths” and signal his choice by pressing a momentary
contact switch. By the use of lights controlled by
the experimenter’s master panel, each subject could be
told that it was his turn to respond and, in addition,
he could be told that his choice was “right” or “wrong.”
The construction and operation of the apparatus was
the same as that described by Crutchfield (1955). The
essential point is that the experimenter was able to
signal the same information to all subjects at the same
time.

¹ Part of a dissertation submitted for the PhD degree.
This study was supported in part as Clinical Investigation
Project No. 6-60-01-002, Research and Development
Division, Office of The Surgeon General, United
States Department of the Army.

The writer would like to express his appreciation
to L. M. Baker, Kenneth M. Michels, Seymour Fisher,
and Wendell R. Wilkin for the help they have given on
various parts of the project.

² Now at the Medical Field Service School, Fort Sam
Houston, Texas.

The experimental procedure required subjects to 
judge the lengths of a series of lines twice. The first 
judgment (A₁) was made before the influence attempt
was introduced. The second judgment (A₂) was a re-estimation
after subjects had observed others making
their choices. Receiving information as to how other 
members of the group perceived the lengths of the 
lines constituted the influence attempt. The apparatus 
was constructed in such a way that the information that 
each subject received was determined by the experi- 
menter. Subjects were led to believe, however, that 
they were actually being informed of estimates made by 
the other people in the room. The “false” information
was designed so as to be easily perceived as being incorrect.
The influence attempt involved a substantial
overestimation of the lengths of the lines by the influence
source (i.e., the “partners’” estimates).

One group served as a control; they made the A₁ and
A₂ judgments without receiving any influence information
(no prior experience-Control, or NPEC). Four
other groups underwent the influence treatment. The 
group sizes were 2, 3, 4, and 5 (NPE-2, -3, -4, and -5). 
Hence, the influence source in these four groups in- 
volved either 1, 2, 3, or 4 partners. An additional five 
groups underwent the same procedures with one exception.
These groups received the prior experience treatment
before making the A₁ judgments (PEC, PE-2, -3,
-4, and -5). On a similar task, they were informed that 
they had done poorly and that the others in the group 
were much more accurate in their estimates. 








[Figure 1 omitted. Horizontal axis: group size.]

Fig. 1. Mean conformity scores for the eight experimental groups.





CRITIQUE AND NOTES 

















TABLE 1

SUMMARY OF THE ANALYSIS OF VARIANCE

Source of Variance      df      MS         F
PE-NPE                   1    150.15      9.66**
GS                       3     62.12      3.99**
  linear                 1     57.79      3.40*
  quadratic              1    117.31      7.55**
  cubic                  1     16.53      1.06
(PE-NPE) × GS            3     14.43       .30
Residual               152     15.54

* .05 < p < .10.
** p < .01.

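
The linear, quadratic, and cubic rows of Table 1 split the group-size effect into trend components. A minimal sketch of such a decomposition for four equally spaced levels is given below, using standard orthogonal polynomial coefficients. The means and the number of subjects per level are hypothetical, and the report does not say how its components were computed.

```python
# Standard orthogonal polynomial coefficients for four equally spaced levels.
TRENDS = {
    "linear": (-3, -1, 1, 3),
    "quadratic": (1, -1, -1, 1),
    "cubic": (-1, 3, -3, 1),
}


def trend_sums_of_squares(level_means, n_per_level):
    """Sum of squares (1 df each) for every trend component."""
    result = {}
    for name, coeffs in TRENDS.items():
        contrast = sum(c * m for c, m in zip(coeffs, level_means))
        result[name] = n_per_level * contrast ** 2 / sum(c * c for c in coeffs)
    return result


def between_levels_ss(level_means, n_per_level):
    """Sum of squares among the level means (3 df)."""
    grand_mean = sum(level_means) / len(level_means)
    return n_per_level * sum((m - grand_mean) ** 2 for m in level_means)


# Hypothetical means for group sizes 2, 3, 4, and 5: a rise followed by a drop.
means = [2.0, 4.0, 5.0, 3.5]
n = 20  # hypothetical number of subjects per level

components = trend_sums_of_squares(means, n)
total = between_levels_ss(means, n)
print(components, total)
```

Because the three contrasts are mutually orthogonal, their sums of squares add up to the sum of squares among the four level means. A rise followed by a drop, as in the hypothetical means, loads mainly on the quadratic component.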
RESULTS 


The conformity score for each subject involved
computing A₂ − A₁ for each trial in which an influence
attempt had been made. The mean conformity
scores for each experimental group are presented
in Figure 1. Conformity definitely occurred
in all groups (NPEC and NPE-2 were significantly
different from each other; t = 2.04; p < .05).
There were also definite differences between the 
experimental groups themselves. A summary of 
the analysis of variance is presented in Table 1. 
The analysis indicates that, for all levels of group 
size, the prior experience treatment significantly 
increased the degree of conformity obtained. 
Group size was also found to have a curvilinear 
relationship to conformity (the quadratic compo- 
nent was significant at p < .01). The change from 
a group size of 4 (GS-4) to GS-5 represented a 
decrease in conformity rather than a leveling-off
effect (t = 2.02; df = 152; p < .025).
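
The conformity score described above can be sketched in a few lines. The report states that the second judgment minus the first was computed for each trial on which an influence attempt was made; averaging those differences over trials, and all names and sample values below, are assumptions for illustration.

```python
def conformity_score(first_judgments, second_judgments, influenced):
    """Mean shift (second minus first) over trials with an influence attempt.

    A positive score means the re-estimate moved toward the partners'
    overestimates.
    """
    shifts = [
        second - first
        for first, second, was_influenced in zip(
            first_judgments, second_judgments, influenced
        )
        if was_influenced
    ]
    return sum(shifts) / len(shifts)


# Hypothetical choices on the 14-point length scale for four trials.
a1 = [5, 6, 7, 8]
a2 = [7, 6, 9, 8]
influence_trials = [True, False, True, True]

score = conformity_score(a1, a2, influence_trials)
print(score)
```

Because the score is a difference, groups can be compared on it only if their first judgments start from about the same level, which is the prerequisite the next paragraph addresses.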

For the formula, Conformity = A₂ − A₁, to be
used, the A₁ estimates of all the groups had to be
approximately equal. This prerequisite was met. 
The mean conformity scores of the two control 
groups were found to be equal, a finding that indi- 
cated that the prior experience treatments in them- 
selves had no definite effect on the re-estimations. 

Interview data indicated that 86% of the total 
sample were aware of the discrepancy between 
their perception of the lines and the influence 
source’s estimates. The data also indicated that 
only 14% of the sample rejected the validity of the 
group situation. There was no relationship between 
this latter attitude and the drop in conformity 
from GS-4 to GS-5.

DISCUSSION


A major finding of this research was the curvi- 
linear relationship between group size and con- 
formity. This constituted a successful replication 
of Asch’s (1952) finding. The difference between 
the present results and the data reported by Kidd 
(1958) and Goldberg (1954) is apparently due to a 
difference in method. The curvilinear relationship 
between group size and conformity appears when 
one is careful to duplicate an essential aspect of 
Asch’s procedure, namely, the use of partner esti- 
mates that are clearly incorrect. 

The results also indicated that the prior experi- 
ence and group size variables were additive at all 
levels of group size. That is, their combination at 
any particular level produced more conformity 
than that elicited by that level of group size alone. 
The degree of additivity was found to be approxi- 
mately the same for all levels of group size. The 
data, therefore, indicate that a fairly uniform
relationship exists between these two variables.

A recent paper by Trehub (1959) introduced
the concept of ego disjunction and presented
an investigation of the relationship between ego 
disjunction and severity of psychopathology. The 
concept of ego disjunction, derived from a double 
approach-avoidance conflict model (Dollard & 
Miller, 1950), assumed that parity or equivalent 
magnitudes of antagonistic need strengths are not 
necessary within a definition of ego disjunction. 
The investigation of the relationship between ego 
disjunction and severity of psychopathology in- 
volved the selection, on rational grounds, of four 
antagonistic pairs of needs (aggression-deference, 
succorance-nurturance, autonomy-abasement, or- 
der-change) from the Edwards (1954) Personal 
Preference Schedule for college students, adoles- 
cents, neurotics, character disorders, and schizo- 
phrenics. The ego disjunction score for each sub- 
ject was computed by summing the T scale scores 
of the two needs which comprise an incompatible 
pair of needs, subtracting 100 (i.e., the sum of the 
two T scale means) from the sum, and summing 
only the positive residuals to produce an overall 
ego disjunction score. 

Two points in this description deserve emphasis: 
The scoring is arbitrary (as indicated by Trehub, 
1959) and neither derives from nor reflects the 
double approach-avoidance model, and the scoring 
procedure produces a confounding of average differ- 
ences and variability differences among groups. 
For example, if the scores of two samples are 
exactly equal, each corresponding to a T scale dis- 
tribution for N = 20, the mean of each group 
equals 50.0, and the average positive residual of 
each group equals 4.0. If, however, we increase 
the variability of one group by multiplying the 
deviation scores by two, the average positive 
residual of this group now equals 8.0 although the 
raw score mean of 50.0 is retained. 
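
The confounding can be checked numerically. The sketch below builds an idealized set of 20 T scores from normal quantiles, which is an assumption about what "corresponding to a T scale distribution" means here, and then doubles every deviation from the mean. The function name is illustrative.

```python
from statistics import NormalDist, mean


def average_positive_residual(scores, cutoff=50.0):
    """Mean of the amounts by which scores exceed the cutoff (zero otherwise)."""
    return mean(max(s - cutoff, 0.0) for s in scores)


n = 20
t_scale = NormalDist(mu=50.0, sigma=10.0)

# Idealized sample: the 20 quantile midpoints of a T scale distribution.
scores = [t_scale.inv_cdf((i + 0.5) / n) for i in range(n)]

# Same mean, doubled variability.
center = mean(scores)
doubled = [center + 2.0 * (s - center) for s in scores]

print(mean(scores), average_positive_residual(scores))
print(mean(doubled), average_positive_residual(doubled))
```

The mean stays at 50 while the average positive residual doubles, so a group can earn a higher score through greater variability alone. For a normal distribution the expected positive residual is the standard deviation divided by the square root of 2π, about 4.0 for a standard deviation of 10 and about 8.0 for 20, in line with the figures in the text.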

The purpose of this note is to present a reanal- 
ysis of Trehub’s (1959) data to permit a clearer 
evaluation of his theoretical contribution. The data 
were constituted for each subject by adding the
scores for the two needs of a need pair, thus providing
four scores, and summing to yield an overall
ego disjunction score. An alternative procedure 
would have been to equate the variance of all 
groups on each need pair according to the average 


¹ I would like to thank Arnold Trehub for his close
cooperation and considerable assistance in the preparation
of this paper.

variance for each need pair and then apply the
method of positive residuals. Such a procedure 
would correspond more closely to Trehub’s original 
procedure, which, in turn, derives from the expecta- 
tion that only mutually incompatible needs with a 
relatively high level of joint strength indicate ego 
disjunction. The full scale T scores were used here, 
however, because the alternative procedure is
extremely tedious, the dividing point for “relatively
high level of joint strength” is arbitrary, and
the issue is not at test. Full scale T score data were 
not available for the adolescent subjects. Thus 
Trehub’s positive residual data and our full scale T 
score data were computed for college students, 
neurotics, character disorders, and schizophrenics. 
The mean ego disjunction score per need pair for 
each of the four groups is presented in Table 1. The 
rank order of the groups is consistent with predic- 
tion according to either method. 


TABLE 1

MEAN EGO DISJUNCTION SCORES (PER NEED PAIR) FOR COLLEGE STUDENTS, NEUROTICS, CHARACTER DISORDERS, AND SCHIZOPHRENICS

Group                   Positive residuals    Full scale T scores
College students               5.01                 100.56
Neurotics                      7.37                 103.80
Character disorders            8.65                 105.77
Schizophrenics                11.09                 107.27





The analyses of variance are presented in Table 
2. A Type I analysis (Lindquist, 1953) was done 
for each set of data, utilizing the four groups as a 
between-subjects variable and the four need pairs 
as a within-subjects variable. The use of full scale T
scores confirms the statistically significant main 
effect among groups which was reported by Trehub 
(1959), although at a higher probability level. The 
analyses of the need pairs main effect and the 
Groups X Need Pairs interaction do not yield 
statistically significant differences by either 
method. Thus we can conclude that the prediction 
of a positive relationship between general ego dis- 
junction and degree of psychopathology is con- 
firmed when the confounding effect of variability 
differences is removed. There is, again, no evidence 
that the groups could be characterized with respect 
to particular areas (need pairs) of ego disjunction.
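
In a design with groups as a between-subjects variable and need pairs as a within-subjects variable, the groups effect is tested against the between-subjects error and the need pairs and interaction effects against the within-subjects error. The sketch below applies that rule to the mean squares reported in Table 2; the dictionary keys and function name are illustrative.

```python
def mixed_design_f(ms):
    """F ratios for one between-subjects and one within-subjects factor."""
    return {
        "groups": ms["groups"] / ms["between_error"],
        "need_pairs": ms["need_pairs"] / ms["within_error"],
        "groups_x_need_pairs": ms["interaction"] / ms["within_error"],
    }


# Mean squares from Table 2.
full_scale_t = {
    "groups": 672.90, "between_error": 120.19,
    "need_pairs": 2.23, "interaction": 312.71, "within_error": 211.35,
}
positive_residuals = {
    "groups": 513.79, "between_error": 52.39,
    "need_pairs": 143.39, "interaction": 102.87, "within_error": 98.22,
}

f_full = mixed_design_f(full_scale_t)
f_positive = mixed_design_f(positive_residuals)
print({k: round(v, 2) for k, v in f_full.items()})
print({k: round(v, 2) for k, v in f_positive.items()})
```

The ratios agree with the tabled F values to two decimals. The need pairs ratio for the full scale T scores is well below 1, which is why no F is printed for it.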

TABLE 2

ANALYSES OF VARIANCE OF EGO DISJUNCTION SCORES

                               Full scale T scores      Positive residuals
Source                   df       MS         F             MS         F
Between subjects         79
  Groups                  3     672.90      5.60*        513.79      9.81**
  Error                  76     120.19                    52.39
Within subjects         240
  Need pairs              3       2.23       —           143.39      1.46
  Groups × need pairs     9     312.71      1.48         102.87      1.05
  Error                 228     211.35                    98.22

* Significant at .01 level.
** Significant at .001 level.

A final analysis utilized the Hartley sequential
range test (Hartley, 1955; Snedecor, 1956) to
evaluate the differences among the four groups 
according to each scoring method. The analysis of 
the positive residual data confirmed Trehub’s 
(1959) analysis (omitting the adolescent group) in 
that all comparisons were statistically significant 
(p = .05, two-tailed) except the comparison of the
neurotics with the character disorders. The anal- 
ysis of the full scale T score data, however, ren- 
dered only a partial confirmation of the previous 


analysis in that the college students were reliably 
discriminable from the schizophrenics and the 
character disorders with no other statistically 
significant comparisons being obtained. Thus we 
can conclude that intergroup comparisons yield 
only a partial categorization of college students as 
distinguished from character disorders and schizo- 
phrenics when the confounding effect of variability 
differences is removed.

ARE DIFFERENCES IN SCHIZOPHRENIC SYMPTOMS RELATED
TO THE MOTHER’S AVOWED ATTITUDES TOWARD CHILD 
REARING? 

WILSON H. GUERTIN 


University of Florida 


The existence of a relationship between symptoms
and childhood experiences or background
is a fundamental assumption of most
theories of psychopathology. Those working with 
schizophrenics look for and report unusual en- 
vironmental or parental features appearing in the 
backgrounds of their patients. This search for a 
relationship between background and schizophrenia
presupposes a uniformity in schizophrenic
symptomatology which clearly does not exist 
(Rabin & King, 1958, p. 227). Other investigators 
try to cross-validate relationships proposed earlier, 
but sample schizophrenics with different symptom 
pictures and conclude that no such relationships 
exist. 

Despite many negative studies, the large num- 
ber of reported differences between the mothers of 
schizophrenics and those of normals (Rabin & 
King, 1958, pp. 260-261) must be explained. The 
differences are either evidence for the existence of 
such a class of relationships or the result of serious 
methodological errors. Differences obtained be- 
tween groups of mothers in most studies cannot be 
attributed conclusively to the psychopathological 
differences between the schizophrenic and normal 
groups since other nonschizophrenic differences 
related to hospitalization and failure in life may be 
operating (Scott, 1958). This is especially true for
obtained differences between two groups of
mothers’ responses to retrospective questionnaires,
where the mother of a schizophrenic’s
guilt, shame, or feeling of failure may lead to
important response biases and even deliberate
falsifications.

Amidst this methodological confusion, a flourish 
of Occam’s razor reduces the issue to the crucial 
general question: Are differences in schizophrenic 
symptoms related to the mother’s avowed attitudes 
toward child rearing? This study will proceed 
de novo, neither proposing nor testing specific sub- 
hypotheses in order to concentrate upon the funda- 
mental question. The weakness of comparing 
social failures (schizophrenics) with social successes 
(normals) will be avoided by relating symptom 
differences within a group of hospitalized schizo- 
phrenics to the avowed attitudes of their mothers. 


PROCEDURE 


The subjects were selected on the basis of the availability
of questionnaire information. They are part of a
group of 49 hospitalized male schizophrenics described
elsewhere in terms of symptom pictures (Guertin, 
1961). In this earlier study, transposed factor analysis 
disclosed five type-factors, and the subject’s loading 
on each of the factors constitutes the independent 
variable in the present investigation. The type-factors 
were labeled normal, catatonic withdrawal vs. over-expressive,
paranoid, resistive isolation, and anxious-dysphoric.
The sample is heavily loaded with chronic
cases despite the imposed upper age limit of 40. (Further 
description of the sample will be found in the original 
article.) 

The mothers’ objectively scored responses to questionnaire
items provided the dependent variable.
Mothers’ attitudes on family life and children were
evaluated by the Parental Attitude Research Instrument
(PARI) developed by Schaefer and Bell (1958)
from an earlier scale (Shoben, 1949). Information about
the subjects and their developmental history and environment
was provided by the M-B History Record
(MBHR) (Briggs, 1959) filled out by the mother or, in 
seven cases, by the father or a sibling. Data were 
gathered entirely by mail, ostensibly to assist in the 
planning of activities for the son in terms of his past 
history. The return rate was 85%. 

Rank-difference correlations were calculated between 
each of the five type-factor loadings for subjects and 
the corresponding mothers’ scores on the 23 scales of 
the PARI and scores on the seven clusters of the 
MBHR (Briggs, 1959). The number of omitted answers 
on the MBHR was correlated with the type-factor 
loadings, increasing the total number of coefficients 
to 155.
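
A rank-difference correlation of the kind described above can be computed with the classical formula, one minus six times the sum of squared rank differences divided by n(n² − 1). The sketch below assumes there are no tied values, since the simple formula is exact only in that case; the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def rank_difference_correlation(x, y):
    """Rank-difference (Spearman) correlation for untied data."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical type-factor loadings for five sons and one questionnaire
# scale score for each of their mothers.
loadings = [0.10, 0.25, 0.40, 0.55, 0.70]
scale_scores = [12, 9, 20, 15, 23]

rho = rank_difference_correlation(loadings, scale_scores)
print(rho)
```

In the study one such coefficient was computed for every pairing of a type-factor with a questionnaire score, which is how the total reaches 155.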


RESULTS AND DISCUSSION 


Only 10 of the 23 PARI scales and 4 of the 7 
MBHR clusters showed correlations with any of
the type-factor loadings significantly different from
zero at less than the 5% level. They are reported 
in Table 1 along with the correlations for number of 
omissions on the MBHR. 

Since the total number of significance tests made
was 155, almost eight values could be expected to
appear by chance alone to be significant. Sixteen,
or twice as many as were attributable to chance,
were actually obtained.
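
The chance expectation quoted above follows from simple arithmetic, sketched here. The variable names are illustrative, and the calculation takes the number of tests and the 5% level at face value without allowing for correlations among the questionnaire scales.

```python
type_factors = 5
pari_scales = 23
mbhr_clusters = 7
omission_counts = 1  # number of omitted MBHR answers, one extra variable
alpha = 0.05

# One coefficient per pairing of a type-factor with a questionnaire variable.
n_tests = type_factors * (pari_scales + mbhr_clusters + omission_counts)

expected_by_chance = n_tests * alpha
observed_significant = 16
ratio = observed_significant / expected_by_chance

print(n_tests, expected_by_chance, round(ratio, 2))
```

The expected count of 7.75 is the "almost eight" of the text, and the 16 coefficients obtained are about twice that figure.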

The uneven distribution of significant (p < .05)
coefficients according to type-factor column in the
table cannot be attributed to chance (chi square =
21.49, df = 4, p < .001). Thus, not only is the number
of significant coefficients double chance expectancy,
but they clearly are not distributed by chance.

TABLE 1

CORRELATIONS OF SIGNIFICANT PARI AND MBHR SUBSCALES WITH SCHIZOPHRENIC TYPE-FACTOR LOADINGS

                                               Type-factor
Scale and subscale           Normal   Catatonic   Paranoid   Resistive   Anxious-
                                                             isolation   dysphoric
PARI (N = 28)
  Encourage verbalization      42*       05          28         25          7
  Seclusion of mother         -23       -08         -19        -17        -45**
  Martyrdom                   -09       -16         -20         00        -38*
  Marital conflict             39*       21         -30        -01        -03
  Irritability                -02        02         -20        -22        -38*
  Suppression of aggression   -25       -12         -25        -40*       -23
  Equalitarianism              55***     00         -15         30     [illegible]***
  Inconsiderate husband        03        17         -29        -25        -38*
  Comradeship & sharing        37*       02          00         35         30
  Dependency mother           -10        14         -19        -26        -41*
MBHR (N = 31)
  Psychopathic                -36*      -02         -06        -19         00
  Achievement                 -28       -37**        22        -26        -04
  Neurotic                     12       -28          12        -35         46***
  Schizoid                     39*       13         -18        -04         17
  Number of Omissions         -12       -03         -07        -27        -50***

* p < .05.
** p < .02.
*** p < .01.

Some type-factors are more closely related to the
questionnaire variables than are the others. That
the mothers of some types of schizophrenics be- 
have differently on the questionnaires than do the 
mothers of others is therefore demonstrated.

While it is established that mothers of different 
types of schizophrenics behave differently on the 
questionnaires at the present time, this is only an 
equivocal answer to the ultimate question of 
whether the patients’ family environments differed. 
Gordon (1957) has shown that questionnaire re- 
sponses may be quite unreliable indicators of how 
a mother actually behaves with her child, and Bell 
(1958) discussed the many problems of evaluating 
child rearing behavior in the past from retro- 
spective questionnaires in the present. Nor is the 
MBHR a direct indication of early environment; 
retrospective distortion and deliberate falsification 
probably occur. Also, very recent adjustment is 
weighted rather heavily in the cluster scores of the 
MBHR. While the exact nature of the childhood 
home situation cannot be inferred easily from the 
PARI or MBHR responses, the differences in 
mothers’ current behavior make it reasonable to 
expect that they also behaved differently toward 
their sons in the past. 

From the pattern of significant coefficients in 
Table 1, it is quite surprising to find that the more
florid symptom types, catatonic and paranoid,
are not systematically related to the questionnaire 
scales. Goldstein and Carr (1956) were disap- 
pointed in seeking differences in mothers’ attitudes 
for these two groups. They did find significantly 
more inability to answer items among the mothers 
of the catatonic. While their findings are not con- 
firmed here by the correlations for omissions, the 
experimental procedures really were not compar- 
able. 

The two symptom types of Table 1 that are 
defined best in terms of questionnaire scales are 
the normal and the anxious-dysphoric. Comparison
of significant correlations for each type-factor with 
the factors derived by Zuckerman, Ribback, 
Monashkin, and Norton (1958) from the PARI 
shows correspondence with two of their three ob- 
tained factors. The normal type-factor is high for 
Zuckerman’s democratic attitudes factor, while the 
anxious-dysphoric type-factor is high negative for 
authoritarian-control. The democratic attitudes 
factor is composed of items added by Schaefer and
Bell (1958) so that the average mother could agree 
with more of the items, and they were called “rapport”
items. As Zuckerman et al. suggest, the factor
probably is related to an acquiescent response
set (p. 170). The meaning behind strong professed 
disagreement with authoritarian-control items by 
the mothers of anxious-dysphoric patients is not 
apparent. Yet it is interesting that these least 
typical schizophrenics (pseudoneurotic) have 
mothers who denounce maternal domination, the 
feature which is most often pointed to as schizophrenogenic
(Ekstein, Bryant, & Friedman, 1958,
pp. 564ff.).

The largest coefficient for the MBHR is for the 
neurotic cluster. Since current adjustment and 
behavior are heavily weighted for that cluster 
score, the high correlation merely reflects the 
neurotic symptoms of the anxious-dysphoric type-
factor. The schizoid MBHR cluster is more 
heavily weighted for social isolation than for 
autism or thought disorder, so the normal type- 
factor may be reflected in a high score on the clus- 
ter but in few schizophrenic symptoms. However, 
there is an implication that those schizophrenics 
who make a good present adjustment are charac- 
terized by a history of social isolation. Quite prob- 
ably the chronic undifferentiated schizophrenic in 
good remission scored high on the normal type- 
factor and is responsible for the observed relation- 
ship. 

In conclusion, it appears that there is some
relationship between schizophrenic symptom
types and the responses of their mothers to questionnaires.
It is difficult, however, to conclude
that their actual home situations differed, but it
seems likely that important home differences
existed.

STATUS RESTORATION AND THE REDUCTION OF HOSTILITY¹


PHILIP WORCHEL 


University of Texas 


Numerous objections have been raised against
the generality of the concept of frustration
as used in the Yale theory of aggression (Dollard, 
Doob, Miller, Mowrer, & Sears, 1939). Maslow 
(1941) contends that not all deprivations lead to 
aggression but only those which are threatening to 
the security, integrity, or status of the organism. 
Rosenzweig (1944) distinguishes between need 
persistive and ego defensive reactions to frustration.
Horwitz (1956) postulates that hostility is aroused 
as a consequence of the reduction of a person’s 
power or adequacy. Worchel (1960) similarly conceives
of hostility as arising as a result of “ego
threat” following the failure or punishment of 
instrumental aggression. It is the purpose of the 
present study to test the implication of a “threat” 
theory of hostility that restoration of status or 
esteem reduces hostility. 


METHOD 


Berkowitz (1958) has suggested some important 
considerations in any adequate test of a catharsis 
hypothesis. Since the present study is concerned 
with the influence of various techniques (other 
than communication and catharsis) on the reduc- 
tion of hostility, the relevant precautions proposed 
by Berkowitz have been taken into account: (a) 
some measure of inhibited aggression, (b) careful
control of the frustrating variable, and (c) separate 
analysis of the results for subjects who are charac- 
teristically hostile. The TAT is used to obtain 
measures of projective hostility and aggression- 
anxiety. The frustrating situation is identical for 
both the experimental and control subjects, and 
the control task is kept as neutral as possible. The 
results for boys and girls are treated separately 
and the self-ideal discrepancy, which has been 
shown to be related to the expression of hostility 
(Rothaus & Worchel, 1960), is included in the 
factorial design. 


Subjects 


From our introductory classes in psychology, 96
freshman and sophomore subjects were drawn to
complete a 2 X 2 X 4 factorial design (Sex X Self-
Ideal Discrepancy X Treatment). The 48 males and
females were classified as either high (above 49) or low
(49 and below) in self-ideal discrepancy on the basis of
their responses to the Self-Activity Inventory (SAI)
(Worchel, 1957) taken during the first meeting of the
class.


¹ This research was supported by the United States
Air Force through the Air Force Office of Scientific
Research of the Air Research and Development Com-
mand, under Contract No. AF 49(638)-460.
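The layout of the factorial design described above can be checked with a short sketch. It enumerates the 2 X 2 X 4 cells and derives the within-cells degrees of freedom for 96 subjects. Equal cell sizes are an assumption made here for illustration; the text does not state them, and the variable names are hypothetical.

```python
from itertools import product

# Factor levels as named in the text.
sexes = ["male", "female"]
discrepancy_levels = ["low", "high"]
treatments = [
    "success",
    "threat removal",
    "threat removal plus success",
    "control",
]

# Every combination of levels is one cell of the design.
cells = list(product(sexes, discrepancy_levels, treatments))

n_subjects = 96
n_cells = len(cells)

# Assumption: subjects are spread evenly over the cells.
subjects_per_cell = n_subjects // n_cells

# Within-cells degrees of freedom: subjects minus cells.
df_within = n_subjects - n_cells

print(n_cells, subjects_per_cell, df_within)
```

The within-cells value obtained this way agrees with the 80 degrees of freedom listed for the Within source in Table 2.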


Procedure 


The experiment was conducted in small groups of 
16-20 subjects selected on the basis of sex and self- 
ideal discrepancy so that each group had approxi- 
mately the same number of males and females with 
high and low self-ideal discrepancy. To minimize social 
structuring (e.g., friendship and leadership patterns) 
and the effect of initial, differential attitudes towards 
the class instructors, subjects for any one experimental 
session were drawn from all four psychology classes 
with no more than five subjects from any one class. To 
reduce the probability of communication from one 
experimental group to another, sessions were conducted 
on successive hours on two successive days. The sub- 
jects were not informed as to the nature of the experi- 
ment until all groups had been tested. There were four 
different treatments for a complete experiment: success 
(E1), threat removal (E2), threat removal plus success
(E3), and control (C). Biases due to time of testing,
communication effects, etc. were minimized by repli- 
cating the experiment three times with different orders 
of testing determined by an assistant. The experi- 
menter did not know the experimental treatment to be 
introduced until after the interpolation of the frustra- 
tion. 

At the first meeting of the classes, the SAI was 
administered with the comment that it has been the 
practice of the department of psychology to administer 
a battery of psychological tests for recommendations 
and guidance, and that since the class hour was not 
long enough to complete the entire battery, it would be 
necessary for the students to report during a free hour 
for further testing. Schedules of their free hours were 
filled out so that appointments could be arranged. 

When the small experimental groups met at the
appointed hour, they were told that the examiner was
a professor in the department of psychology, that the
scores on their tests would be kept on file, and that
they could arrange for appointments to review their
scores. An “intelligence test” consisting of a number
of subtests was then administered with the comment
that each test was timed from 30-90 seconds and that
most students would be able to finish each subtest in
the prescribed time limits. The first subtest was quite
easy and most subjects were allowed to finish it without
any comment from the experimenter. The next four
subtests were difficult and could not be finished within
the prescribed time limits, and the experimenter kept
interrupting with such belittling remarks as: “You are
too slow,” “Speed it up,” “You are taking too long on
the items,” “Skip around if you can’t do one item,” etc.
After the fifth subtest, the procedure varied, depending
upon the nature of the experimental treatment. At this
point, the experimenter would open an envelope to
indicate the treatment to be employed.

One experimental group (success) continued the
intelligence test which had five more subtests attached
to the booklet. For these subtests, the experimenter
made no further comment during the testing, allowed
most subjects to finish, and praised their work at the
end of each subtest with such remarks as “You are
doing much better,” “That’s good, most of you have
finished the test.” The second experimental group
(threat removal) had a neutral questionnaire requesting
some biographical data following the fifth and last
subtest of the booklet. This questionnaire was arranged
so that it would take approximately the same amount
of time as the last five subtests. At the end of the
questionnaire, the experimenter apologized for
mistakenly giving the wrong time limits on the
intelligence test. He remarked that he had taken a
stop-watch with a 30 second face instead of a 60 second
face, and therefore, they had received only one-half of
the prescribed time for each subtest. The test,
therefore, was not valid and it would be best to destroy
the papers. He asked the class to cross out the first
five tests. The third experimental group was given both
threat removal and success. At the completion of the
first five subtests (hostility arousal), the experimenter
remarked that he had given the wrong time limits, etc.
These tests would not count and, therefore, they could
destroy them by marking through the answers, but for
the next five subtests the time limits would be correct.
As previously, praise was given at the end of each
subtest and most subjects were allowed to complete the
tests. The fourth group, the control session, was given
the neutral questionnaire to complete at the end of the
fifth subtest of the intelligence test with no further
remarks.

Of course, the procedure does not eliminate entirely
the possibility of fantasy aggression during the
experimental treatment even though all the subjects
were kept occupied. The same opportunity for such
fantasy, however, is also possible during the control
condition. Therefore, differences, if they occurred,
between experimental and control subjects should not
be due to differences in fantasy aggression.


TABLE 1
MEAN SCORES FOR DIRECT AGGRESSION (DA), SELF-AGGRESSION (SA),
PROJECTIVE-AGGRESSION (PA), AND AGGRESSION-ANXIETY (AA) ON ALL
GROUPS OF SUBJECTS UNDER ALL CONDITIONS

Treatment                    Sex     Self-ideal    DA     SA     PA    AA
Threat removal               Male    Low          47.7   45.5   2.0    …
Threat removal               Male    High         39.5   41.2   2.0    …
Threat removal               Female  Low          45.0   39.5   1.7    …
Threat removal               Female  High         47.0   45.1   2.4    …
Success                      Male    Low          46.7   41.3   3.7   3.8
Success                      Male    High         45.0   33.3   2.3   2.9
Success                      Female  Low          39.5   39.2   2.7   2.2
Success                      Female  High         40.8   50.8   2.6    …
Threat removal and success   Male    Low          55.0   39.8   2.2    …
Threat removal and success   Male    High         47.2   47.2   2.3    …
Threat removal and success   Female  Low          44.2   48.7   2.2    …
Threat removal and success   Female  High         42.8   43.0   2.0    …
Control                      Male    Low          63.3   39.2   2.3    …
Control                      Male    High         57.2   44.0   2.7    …
Control                      Female  Low          62.0   41.0   2.3    …
Control                      Female  High         67.7   49.3   3.3    …

Note. Self-ideal = self-ideal discrepancy (LoSI, HiSI). Entries
marked … are illegible in the source.


Dependent Variables


To get a measure of their hostility following the
experimental treatment, all subjects were administered
a Test of Social Sensitivity and Insight. Instructions
emphasized the importance of accuracy in judging
others even under conditions of brief contact, and
in honestly assessing one’s own feelings and attitudes
towards oneself. The subjects were requested to place
a check mark in one of the five blanks at the right of
each statement which best described their opinion from
“strongly agree” to “strongly disagree.” Two scores
were obtained from this rating scale: hostility toward
the experimenter (DA) and self-aggression (SA).
Hostility toward the experimenter included agreement
with those items which devaluated and criticized the
experimenter or disagreement with items that described
the experimenter in favorable terms. Self-aggression
was scored on items where the subject blamed himself
for poor performance.


TABLE 2
ANALYSIS OF VARIANCE OF DIRECT AGGRESSION
SCORES FOR ALL GROUPS

Source                           df      SS     MS      F
Treatment                         3    5744   1915   9.82*
Self-ideal                        1      98     98
Sex                               1      58     58
Treatment X Self-ideal            3      86     29
Treatment X Sex                   3     641    214   1.10
Sex X Self-ideal                  1     372    372   1.91
Sex X Self-ideal X Treatment      3      68     23
Within                           80   15632    195

* Significant beyond the .01 level.

In order to obtain a measure of aggression anxiety,
six slides of the TAT were used. The subjects were told
that the test measures imagination and the usual
directions for the TAT were read. Then each slide was
projected for 30 seconds following which the subjects
were allowed 3-4 minutes to write a story. Two scores
were obtained from the protocols: personal or projective
aggression (PA) and impersonal aggression or aggression
anxiety (AA). Personal or projective aggression is any
act of aggression from one person to another person or
object. The criteria for an act of aggression were those
used by Clark (1955). An impersonal response
(aggression anxiety) includes any act of aggression
toward a person from a source which is vague and clearly
not from another person; in a sense, an act of God or
fate. For example, a person is injured by an animal, or
through an accident or illness. Two judges scored each
protocol for the presence of a personal and an
impersonal aggression. The interjudge correlation for
PA was .86 and for AA .73.


RESULTS AND DISCUSSION 


The means for each group across the three 
replications did not differ significantly and there-
fore, all subjects under each of the four experi-
mental conditions were combined. The mean 
aggression scores are shown in Table 1. Only the F 
ratio for the treatment variable on DA was signifi- 
cant beyond the .01 level (Table 2). None of the F 
ratios on any of the other dependent variables 
attained significance. 
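As a rough check on the F ratios cited above, the sketch below recomputes them from the sums of squares and degrees of freedom printed in Table 2. Only the three sources with a legible printed F are used, and the variable names are illustrative rather than the author's.

```python
# Sums of squares and degrees of freedom as printed in Table 2.
effects = {
    "Treatment": (5744, 3),
    "Treatment X Sex": (641, 3),
    "Sex X Self-ideal": (372, 1),
}
ss_within, df_within = 15632, 80

# The within-cells mean square serves as the error term.
ms_within = ss_within / df_within

# F = (SS_effect / df_effect) / MS_within for each source.
f_ratios = {
    name: (ss / df) / ms_within
    for name, (ss, df) in effects.items()
}

for name, f in f_ratios.items():
    print(f"{name}: F = {f:.2f}")
```

Small differences from the printed values are expected, since the table reports mean squares rounded to whole numbers.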

On DA, the means for threat removal, success, 
and threat removal plus success were 44.8, 43.0, 
and 47.3, respectively. They did not differ from 
each other significantly but they were all signifi- 
cantly lower than the control group (62.5). The 
fact that there were no differences on SA, PA, and
AA suggests that the results on direct aggression
do not appear to be affected by inhibitory mecha-
nisms. Therefore, the three techniques for remov-
ing threat and restoring status seem to be quite 
effective in reducing hostility. 
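The treatment means quoted above can be recovered from the DA cell means in Table 1. The sketch below averages the four Sex X Self-ideal cells within each treatment; it assumes equal cell sizes, so that an unweighted mean of cell means equals the treatment mean. That assumption is not stated in the text.

```python
from statistics import mean

# DA cell means from Table 1, ordered male low, male high,
# female low, female high self-ideal discrepancy.
da_cell_means = {
    "threat removal": [47.7, 39.5, 45.0, 47.0],
    "success": [46.7, 45.0, 39.5, 40.8],
    "threat removal plus success": [55.0, 47.2, 44.2, 42.8],
    "control": [63.3, 57.2, 62.0, 67.7],
}

# Unweighted mean of the four cells in each treatment
# (assumes equal numbers of subjects per cell).
treatment_means = {
    name: mean(values) for name, values in da_cell_means.items()
}

# Difference between control and each status-restoring treatment.
reductions = {
    name: treatment_means["control"] - value
    for name, value in treatment_means.items()
    if name != "control"
}

for name, value in treatment_means.items():
    print(f"{name}: {value:.1f}")
```

On these figures each of the three experimental treatments lies roughly 15 to 20 points below the control mean.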

The results, in general, confirm the importance 
of power restoration although the subjects in the
present experiment did not take any action to 
change the experimenter as was true in the experi- 
ment by Horwitz, Goldman, and Lee (1954). The 
design does not permit any direct comparison 
between the techniques of power restoration 
utilized by Horwitz et al. and those in the present 
investigation but our data suggest that reduction 
of hostility by communication which alters the 
frustrating agent or by communication of hostility 
alone to the frustrating agent (Thibaut & Coules, 
1952) may be due to the restoration of esteem in 
these situations. 

It is possible, also, that the effective factor in
reducing hostility by catharsis may be the experi-
ence of power or status that follows when one
“lets off steam” (McClelland & Apicella, 1945;
Sears, Maccoby, & Levin, 1957). In a study of
anxiety reduction, McKeachie, Pollie, and Speis-
man (1955) found that subjects who were encour-
aged to write comments about their answers to an
examination made higher scores than students who
had conventional answer sheets. Subjects, how-
ever, who were instructed to write explanations
made slightly higher scores than those who were
encouraged to write feelings. The authors suggested,
among other things, that allowing students to com-
ment may change their perception of the test from
one of punishment to one of facilitating their suc-
SUMMARY


The present study reports positive results on a
test of the implications of a “threat” theory of
hostility, namely, that hostility is reduced by
status restoration. Essentially, the experiment
deals with techniques designed to restore the
status or the integrity of the subject, who has been
subjected to the hostility-arousing conditions, with-
out permitting expression of aggression (catharsis
or communication).


THE EFFECT OF OVERT AGGRESSION ON PHYSIOLOGICAL
AROUSAL LEVEL¹


JACK E. HOKANSON AND SANFORD SHETLER


Florida State University 


THE question of whether the expression of
aggression leads to a reduction in aggressive 
drive has been a point of conflict in the literature 
ever since first proposed by Dollard, Doob, Miller, 
Mowrer, and Sears (1939). The central problem in 
the study of this “‘catharsis hypothesis” has been 
the inability of investigators (Berkowitz, 1958) to 
“demonstrate clearly that the decrease in (post- 
aggression) hostile behavior is due to drive reduc- 
tion and not to response inhibition” (p. 274). 

A more general (and presumably more testable) 
question that has recently come under investiga- 
tion concerns the reduction of general arousal level 
after the expression of hostility. Worchel (1957)
offers indirect evidence that the expression of ag-
gression may reduce performance-interfering emo-
tions (anger and anxiety). Hokanson (1961) found
a suggestive negative correlation between vigor of 
aggressive response following frustration and post- 
aggression elevation in systolic blood pressure. 
The present experiment attempts to test directly 
the hypothesis that the expression of aggression, 
after a frustration, produces a greater reduction in 
physiological arousal (systolic blood pressure) than 
having no opportunity to aggress. 

In addition, the present research uses both a 
high and a low status frustrator in an attempt to 
assess the effects of this variable on “tension” 
reduction. The hypothesis is advanced that post-
aggression systolic level should be more reduced
with a low status frustrator than with one of high
status.


METHOD 


Fifty-six undergraduate students² taking an intro-
ductory course in psychology were arranged in a 2 X
2 X 2 factorial design. These three variables were:
high and low frustration; high and low status of the
frustrator (experimenter); opportunity vs. no oppor-
tunity to aggress against the frustrator following the
frustration manipulation.

The experiment was introduced to the subject as one
involving blood pressure response to working on routine
intellectual tasks. To aid in establishing this deception,
after an initial 8-minute adaptation period, the subject
was administered the Picture Completion subtest of
the WAIS, following which blood pressure was meas-
ured. After another 2-minute rest period the frustration
manipulation took place. The subject was asked to
count backwards from 99 to 1 by two’s as quickly as
possible. Subjects in the high frustration condition
were then repeatedly interrupted and harassed con-
cerning their slow performance; asked to begin again
four times; and finally stopped with the statement that
their data could not be used. Earlier use of this tech-
nique (Hokanson, 1959) indicates that subjective
feelings of anger are markedly increased. Subjects in
the low frustration condition counted backwards from
99 to 1 by two’s as quickly as they could, ending with
the experimenter remarking “Good.”


¹ This project was supported by the Research
Council of Florida State University.

The authors are also indebted to Michael Burgess
for his assistance in carrying out this study.

² The majority of subjects were female; however,
owing to limited availability of subjects some males
were also used. These latter subjects were evenly dis-
tributed throughout the eight cells in the experiment.

Immediately following the frustration manipula-
tions half of the subjects were given an opportunity
to aggress physically against the frustrator by ad-
ministering electric shocks³ to him. This situation was
structured as follows: Subject was told that the next
task involved an interpersonal guessing situation in
which the subject was to think of a number between 1
and 10, following which the experimenter was to guess
the number. If the experimenter’s guess was wrong
the subject was to signal this error by administering
the shock. In this way the experimenter was presum-
ably studying the effect of pain on his subsequent
guesses. If the experimenter was correct, the subject
would indicate this verbally. Among all opportunity
subjects, the experimenter was thus “shocked” at
least 7 times out of a total of 10 trials. Subjects in the
no opportunity condition went through the same
procedure except that the signaling of the experi-
menter’s errors was done by simply flashing a light
instead of shocking the experimenter.

The status manipulation was introduced at the
beginning of the experimental session. The “high
status” experimenter, a distinguished appearing male
of 48 years, introduced himself as a visiting professor
from a large eastern university who was continuing his
research on blood pressure at Florida State University.
The low status experimenter, an average-appearing
male student assistant of 20 years, introduced himself
as an undergraduate psychology major who was carry-
ing out an experiment for one of his professors. Exten-
sive rehearsal prior to the project insured that each
experimenter used comparable procedures throughout
the experimental session.

Systolic blood pressure was recorded before and
after each phase of the experiment, thereby affording
an index of change as a result of each manipulation.


³ Although the experimenter was wired to an elab-
orate “shock” apparatus, he did not actually receive
a shock, but merely behaved as if he had.


RESULTS 

Table 1 presents the analysis of variance of
systolic changes resulting from the frustration
manipulation. The significant main effect for
frustration indicates that subjects in the high
frustration condition manifested considerably
greater systolic increases (11.4 mm. of Hg) than
subjects in the low frustration treatment (3.9 mm.
of Hg).
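The frustration main effect can be reproduced from the entries of Table 1. The sketch below computes the error mean square and the F ratio from the printed sums of squares, and averages the two high-frustration cell means. Equal cell sizes are assumed for the averaging step, and the names are illustrative.

```python
# Sums of squares and degrees of freedom as printed in Table 1.
ss_status = 5.8
ss_frustration, df_frustration = 787.5, 1
ss_interaction = 37.8
ss_within, df_within = 1231.1, 52
ss_total = 2062.2

# The component sums of squares should add up to the total.
ss_sum = ss_status + ss_frustration + ss_interaction + ss_within

# Error mean square and F ratio for the frustration main effect.
ms_within = ss_within / df_within
f_frustration = (ss_frustration / df_frustration) / ms_within

# High-frustration cell means (low status, high status);
# their unweighted average assumes equal cell sizes.
high_frustration_cells = [12.57, 10.29]
high_frustration_mean = sum(high_frustration_cells) / 2

print(round(ms_within, 2), round(f_frustration, 2))
print(round(high_frustration_mean, 2))
```

The recomputed F and the averaged cell means agree with the printed F of 33.2 and the 11.4 mm. increase reported for the high frustration condition.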

An analysis of the changes in systolic blood pres-
sure after the expression of aggression, with respect
to prefrustration systolic level, is presented in
Table 2 along with the results of a Duncan multiple
range test among the eight treatments. This
analysis is aimed at pointing up any differences
among conditions in the extent to which systolic
pressure returned to its pre-arousal level. The sig-
nificant Status X Opportunity interaction suggests
that regardless of frustration treatment, subjects
with the low status experimenter have a trend
toward higher elevation in blood pressure at the
end of the experiment under the no opportunity to
aggress condition (6.0 mm. of Hg) than with oppor-
tunity (1.6 mm. of Hg); whereas under high status
the trend is reversed, with the “opportunity”
subjects having a somewhat greater elevation (4.1
vs. 2.7 mm. of Hg). Comparison of the four cell
means by Duncan multiple range test reveals no
significant differences among treatments. These
trends, however, are elaborated further in the
second order interaction.

The Status X Frustration X Opportunity
interaction is best seen by reference to the cell
means and the results of the multiple range test
presented at the bottom of the table. The findings
suggest that a major part of the variance in this
interaction can be accounted for by the marked

TABLE 1
CHANGE IN SYSTOLIC BLOOD PRESSURE AS A
RESULT OF FRUSTRATION

Source                   df       SS      MS       F
Status                    1      5.8     5.8
Frustration               1    787.5   787.5   33.2**
Status X Frustration      1     37.8    37.8
Within                   52   1231.1    23.7
Total                    55   2062.2

Cell means of systolic changes during frustration

                High frustration   Low frustration
Low status           12.57                …
High status          10.29                …

Note. Entries marked … are illegible in the source.


TABLE 2
ELEVATION OF SYSTOLIC BLOOD PRESSURE
AFTER THE EXPRESSION OF AGGRESSION
(re prefrustration level)

Source                                 df       SS      MS      F
Status                                  1      1.8     1.8
Frustration                             1     44.6    44.6
Opportunity                             1     31.5    31.5
Status X Frustration                    1     77.8    77.8
Status X Opportunity                    1    120.1   120.1   5.18*
Frustration X Opportunity               1     20.7    20.7
Status X Frustration X Opportunity      1     97.7    97.7   4.21*
Within                                 48   1113.1    23.2
Total                                  55   1507.3

Cell means of systolic changes during the
expression of aggression

                     Low status              High status
                  Low        High         Low        High
                  frustra-   frustra-     frustra-   frustra-
                  tion       tion         tion       tion
Opportunity       1.43       1.71         3.7…       4.3…
No opportunity    2.00      10.00          …          …

Note. Cell means with different subscripts are significantly different
from one another at the .05 level by Duncan multiple range test. The
subscripts and the entries marked … are illegible in the source.
* p < .05


elevation in the low status-high frustration-no
opportunity condition. The systolic elevation (re
prefrustration level) of 10.0 mm. of Hg in this cell
is reliably greater (p = .05) than the elevations in
the remaining seven cells, with no other inter-
condition differences being significant.
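The two significant interaction terms and the low status marginal means can be checked against Table 2 with a few lines of arithmetic. The sketch below uses only values that are legible in the table; the averaging step assumes equal cell sizes, and the names are hypothetical.

```python
# Error mean square and interaction mean squares from Table 2.
ms_within = 23.2
ms_status_by_opportunity = 120.1
ms_three_way = 97.7

# Each effect has 1 df, so F is the mean square over the error term.
f_status_by_opportunity = ms_status_by_opportunity / ms_within
f_three_way = ms_three_way / ms_within

# Low status cell means (low frustration, high frustration).
low_status_cells = {
    "opportunity": [1.43, 1.71],
    "no opportunity": [2.00, 10.00],
}

# Collapse over frustration (assumes equal cell sizes).
low_status_means = {
    condition: sum(values) / len(values)
    for condition, values in low_status_cells.items()
}

print(round(f_status_by_opportunity, 2), round(f_three_way, 2))
print(low_status_means)
```

The collapsed means correspond to the 1.6 and 6.0 mm. of Hg cited in the text for the low status experimenter, and most of the 6.0 figure comes from the single high frustration, no opportunity cell.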


DISCUSSION 


We have here evidence in support of the view
that the blocking of aggressive behavior towards a
frustrator following an anger provocation results in
an elevated physiological arousal level, but only
with a frustrator who is perceived as being of ap-
proximately equal or lower status. Following the
course of systolic pressure during the experiment
for subjects with the low status frustrator: there is
a mean increase of 12.6 mm. of Hg as a result of the
frustration manipulation; a return of blood pres-
sure to within 1.7 mm. of Hg (on the average) of
prefrustration level for subjects given an oppor-
tunity to aggress against the experimenter; but a
negligible drop (2.6 mm.) for subjects given no
opportunity to express aggression. Both the oppor-
tunity and no opportunity subjects were faced with
essentially the same intellectual and physical
problem during the “shock the experimenter” portion
of the procedure (guessing numbers technique)
with the exception that the opportunity subjects
signaled an error by physically hurting the experi-
menter. Apparently then, the knowledge that the
experimenter was undergoing discomfort at the
hands of the subject was the critical factor in the
reduction of blood pressure. With respect to the
no opportunity subjects, arousal level was increased
during frustration, and remained high at the con-
clusion of the experiment in the absence of any
“relief” type of interaction with the frustrator.
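The course of systolic pressure described above for the low status, high frustration subjects reduces to simple arithmetic. The sketch below works it through with the figures given in the text; the variable names are illustrative only.

```python
# Figures from the text, in mm. of Hg, for subjects frustrated
# by the low status experimenter.
rise_during_frustration = 12.6

# With opportunity to aggress: pressure ends within 1.7 of baseline.
residual_with_opportunity = 1.7
drop_with_opportunity = (
    rise_during_frustration - residual_with_opportunity
)

# Without opportunity: pressure drops by only 2.6.
drop_without_opportunity = 2.6
residual_without_opportunity = (
    rise_during_frustration - drop_without_opportunity
)

print(round(drop_with_opportunity, 1))
print(round(residual_without_opportunity, 1))
```

The residual elevation without opportunity matches the 10.0 mm. of Hg cell mean in Table 2, whereas subjects who could aggress gave back nearly all of the frustration-induced rise.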

The question arises then as to why there was 
not a similar persistence of elevated arousal in 
the high status-high frustration-no opportunity 
condition. The most parsimonious explanation 
seems to be one involving expectancy constructs.
Quite possibly, subjects have learned that there is 
very little likelihood of their being able to aggress 
directly against a person of relatively high status; 
and therefore, even though frustrated, response 
tendencies toward overt aggression would probably 
be only minimally aroused. More likely would be 
the evocation of withdrawal responses. Towards 
the close of the experimental session then, tension 
reduction may have taken place at the prospect of 
getting away from the frustrating, high status 
experimenter. 

Closer inspection of the cell means at the bottom
of Table 2 suggests further that subjects frus-
trated by the high status experimenter and given
an opportunity to aggress towards him had a trend
toward greater post-aggression systolic elevation
than subjects in the no opportunity condition.
Quite possibly, being placed in a situation that
calls for aggression towards a person of relatively
high status may be associated with a certain
amount of arousal.

From the above discussion a tentative generali-
zation may be drawn: following a frustration (and
its concomitant elevation of physiological proc-
esses), “tension” reduction may take place when
the subject makes a response which he perceives to
be appropriate to the situation, i.e., overt aggres-
sion towards a frustrator of equal or lower status
or withdrawal (or some other nonaggressive be-
havior) with a high status frustrator. Should these
“appropriate” responses be blocked, we have in
the present experiment partial evidence at least
that physiological arousal will remain high.


SUMMARY 


Fifty-six undergraduate male and female sub-
jects were exposed to the following conditions in a
three-way factorial arrangement: high or low
frustration by a high or low status experimenter
with a subsequent opportunity or no opportunity
to aggress physically (via electric shocks) towards
the frustrator. Systolic blood pressure was meas-
ured before and after the frustration manipulation
and the expression of aggression.

The main findings in this experiment were: (a)
frustration led to significantly greater systolic in-
creases than the no frustration control condition
with both the high and the low status experi-
menter; (b) subjects frustrated by the low status
experimenter and given an opportunity to aggress
against him manifested a return of blood pressure
to prefrustration levels which was not significantly
different from that of nonfrustrated subjects;
whereas, (c) subjects frustrated by the low status
experimenter and given no opportunity to aggress
against him manifested significantly greater
systolic elevations at the conclusion of the experi-
ment than either frustrated-opportunity to aggress
subjects or nonfrustrated subjects; (d) subjects
frustrated by the high status experimenter mani-
fested a return of blood pressure to prefrustration
levels which was not significantly different from
that of nonfrustrated subjects in both the oppor-
tunity and no opportunity to aggress conditions.

The results were discussed as offering support 
for the hypothesis that under certain conditions 
overt aggression has “‘tension’’ reducing qualities; 
but also, that other types of behavior (i.e., with- 
drawal) may reduce arousal if they are appro- 
priate to the situation. 
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TRENDS IN COGNITIVE ABILITY IN THE OLDER AGE RANGES 








sub- ‘ se 
sina A. E. MAXWELL! 
~ low Institute of Psychiatry, Maudsley Hospiial, and University of London 
enter 
unity ve to different notions about how best to covariance matrices for the four age groups (60-64, 
vards D rotate psychological factors into meaningful 65-69, 70-74, and over 75 years) themselves differ, 
meas- positions, factor analysts often come to somewhat for only in so far as they do would one expect to 
ation different conclusions even when they analyse the _ find differences in factor patterns. The test would 
same data. For this reason, it is not to be expected _ be required to be done on the data for raw scores 
: (a that there will be universal agreement about the as the transformation to scaled scores would tend 
ic in- factors of Doppelt and Wallace’s (1955) data for to obliterate differences in variance between the 
lition the Wechsler Adult Intelligence Scale for older groups. However, as all the data required to re- 
(pen: people presented in this paper. However, this need construct the variance-covariance matrices are not 
tatus not be thought of as wholly unsatisfactory if it is available in Doppelt and Wallace’s (1955) paper, 
Bress made clear at the outset what the aim of the we must pass straight to factor analyses of the 
ssure present analysis is. standardized covariance or correlation matrices. 
antly Clinically, it is often customary to use only 10 of 
jects; the subtests of the WAIS battery, and in the Factor Analyses 
tatus interpretation of factors derived from the inter- The original intention was to use Lawley’s 
gress correlations of these tests, it often is the clinical maximum likelihood method of analysis (Thomp- 
eater psychologist’s desire that the first factor be put son, 1951) for all four correlation matrices, so that 
per: } through Vocabulary. The cognitive content of this a statistically valid test of significance of residuals 
sean test, namely, verbal comprehension, is readily ap- would be available. However, difficulties were 
jects praised, and this helps in interpreting the factor. encountered with this method where the data for 
han- But since the Performance tests have considerable the 60-64 and the 65-69 age groups were con- 
ston loadings on a factor anchored in this way, it is well cerned; the communalities and the factor loadings 
a to think of it not simply as a verbal but rather asa for Block Design approached unity, and the 
ypor- verbal-intellectual factor. When an analysis of the method naturally broke down. For these matrices 
- battery yields only two significant factors (Thomp- the first four principal components (Table 1) are 
son, 1951), it now turns out that Block Design and Object Assembly
have appreciable positive loadings, and Vocabulary an appreciable
negative loading, on the second factor when orthogonality is retained.
The latter can then be thought of as a space performance or
visualization factor. These are the two factors which we had primarily
in mind when we came to analyse the intercorrelations of the WAIS tests
for Doppelt and Wallace’s (1955) normative data for older people, and
the purpose of the analysis was to see if there were any clear-cut
trends in the contributions of the several tests to these factors after
the age of 60.

Notions of rotating the factors to “simple structure” were not
entertained. It is the writer’s (Maxwell, 1959) contention that, in our
present state of ignorance about the standard errors of factor
loadings, the simple structure concept is too easily abused. This,
then, is a comparative study in factor analysis. Strictly speaking,
such a study should begin with a test of whether the variance-

* Thanks are due to Don Kendrick for valuable discussion during the
preparation of this paper; to W. L. B. Nixon, who did the factor
analyses and inverted the correlation matrices on the London University
Mercury Computer; and to N. Hemsley, who assisted with the other
calculations.

given instead. For the 70-74 group, four significant maximum likelihood
factors were found, and for the Over 75 group three significant
factors. They too appear in Table 1.

The factors were now rotated so as to maximize the loading for
Vocabulary on the first, and the loadings for Block Design and Object
Assembly on the second factor. The remaining significant factor for the
70-74 age group, which had now been anchored as a result of these
rotations, was found to have high loadings for Picture Completion,
Picture Arrangement, and Similarities. As this is possibly the “second
reasoning factor . . . apparently distinct from that measured by
Arithmetic,” noted by Davis (1956), it was decided to rotate the third
and fourth factors for the other three age groups so as to bring it
into prominence for them too. The results are given in Table 2. The
third factor may be labeled eduction of conceptual relations in the
light of Davis’s description of it, but it will be discussed further
below.

Loadings for the fourth factor are also given, but apart from those for
the 70-74 age group, where this factor is statistically significant,
the factor need not be taken too seriously. Though the maximum
likelihood method of analysis broke down in the case of the two younger
age groups, the re-
TABLE 1
UNROTATED FACTOR LOADINGS

                         Age 60-64             Age 65-69             Age 70-74          Age over 75
Test                    I   II  III   IV      I   II  III   IV      I   II  III   IV      I   II  III
Comprehension          76   39   02   16     71  -43   03  -16     52  -18  -57   11     77  -16   18
Arithmetic             79   15   12   36     77  -09  -30  -19     62  -40  -20  -08     78   14  -28
Similarities           76   35  -21   30     73  -04   29  -41     67  -24   25   39     72   17   4?
Digit Span             63  -31   67   11     61  -48  -35   19     50  -47  -24  -22     65  -21  -35
Vocabulary             82   34   10   05     8?  -28   01  -05     72  -52  -19   00     85  -39   10
Digit Symbol           81  -21   05   26     80   12  -23   32     71   01  -34  -24     71   15  -23
Picture Completion     75   08   08  -33     74   10   41   44     90   05   12  -01     57   30   01
Block Design           83  -26  -10   06     76   45  -22   07     46   19  -82   03     66   43  -34
Picture Arrangement    75  -22  -14  -33     80   06   44   00     68   11  -21   23     47   53   41
Object Assembly        72  -36  -39   11     64   60  -18  -23     37  -54  -06  -37     58   56  -11

Note.—Decimal points omitted.
TABLE 2
ROTATED FACTOR LOADINGS FOR THE FOUR AGE GROUPS

                          Factor I              Factor II             Factor III         Factor IV
Test                  60+  65+  70+  75+    60+  65+  70+  75+    60+  65+  70+  75+    60+  65+  70+
Comprehension          85   83   79   77    -04   10   01  -02     14   01   07   24     09  -16  -08
Arithmetic             78   74   74   76     11   32   14   34    -42  -22   06  -04     01  -19   16
Similarities           83   68   66   58     11   18   27   00     35   36   43   61     01  -41  -15
Digit Span             44   76   72   68     13  -07   10   31    -33  -39  -14  -17    -79   19   17
Vocabulary             89   90   89   93    -02   10   10  -08    -10   04   12   07    -07  -05   10
Digit Symbol           65   67   53   59     48   49   48   46    -33  -10   09   17    -06   32   40
Picture Completion     72   63   58   40     16   27   19   33     19   51   60   39    -33   44   51
Block Design           65   50   34   42     56   76   90   69    -15   00   02   27    -15   07  -03
Picture Arrangement    59   70   41   21     52   25   42   14     22   53   47   78    -28   00   10
Object Assembly        50   33   40   29     74   83   52   58    -08   06  -27   48     08  -23   25

Note.—Decimal points omitted.
TABLE 3
WEIGHTS FOR FINDING FACTOR SCORES

                             Factor I                   Factor II                  Factor III
Test                    60+   65+   70+   75+      60+   65+   70+   75+      60+   65+   70+   75+
Comprehension           .32   .28   .24   .14     -.34  -.30  -.11  -.13     -.20  -.04  -.04     ?
Arithmetic              .20   .16   .16   .16     -.18   .09  -.05   .19     -.59  -.36   .13   .20
Similarities            .30   .13   .15   .07     -.15  -.08  -.01   .20      .68   .36   .31   .39
Digit Span             -.06   .29   .17   .12     -.13  -.21   .06   .15      .31  -.52   .23  -.20
Vocabulary              .30   .25   .46   .60      .35   .17  -.19   .45     -.05  -.04  -.22  -.13
Digit Symbol            .01   .08   .03   .06      .29   .24   .08   .19      .51  -.23   .23  -.03
Picture Completion      .16   .08  -.08   .01      .09   .02   .03   .07      .43   .54   .88   .09
Block Design           -.01  -.05   .11  -.01      .38   .49   .93   .50     -.27   .14   .01   .00
Picture Arrangement     .00   .11  -.02   .05      .36  -.06   .06  -.07      .48   .56   .16   .48
Object Assembly        -.08   .13   .05  -.05      .68   .59   .08   .29      .29   .06  -.28   .18
SD of factor scores    1.00  1.00   .94   .96     1.00  1.00   .95   .87     1.04  1.00   .88   .88
siduals after one factor had been extracted were only just significant
for these groups, and though the principal component factors, which
were used instead, show a few loadings of considerable magnitude on the
fourth factor, these may safely be taken to be “specific” in nature and
result from the use of unities in the diagonal cells of the matrix.

To facilitate a comparison of the four sets of results, it was deemed
essential to obtain regression weights by means of which factor scores
are obtained. These show the relative importance of the tests for
predicting the several factors. They appear in Table 3. In Table 4, the
variances and covariances of the factor scores are given. Examination
of the latter table shows that the weights yield factors which were
virtually orthogonal, since the off-diagonal entries in the matrices
are almost zero. The standard deviations of the factor scores when
standardized test scores are used are given by the square roots of the
entries in the main diagonals, and to facilitate comparison, these are
entered below the weights in Table 3. Where these standard deviations
fall below unity, the corresponding weights should be scaled up
appropriately, but this can be done mentally when comparing the results
for one age group with those for another.

TABLE 4
VARIANCES AND COVARIANCES OF FACTOR SCORES

                 (60-64)                        (65-69)
            I       II      III            I       II      III
I        .9989   .0003  -.0126         1.0002  -.0003  -.0007
II       .0002   .9987  -.1144         -.0003  1.0076  -.0005
III     -.0126  -.1144  1.0412         -.0008  -.0005  1.0013

                 (70-74)                       (Over 75)
            I       II      III            I       II      III
I        .8876   .0519  -.0554          .9233   .0351   .0391
II       .0520   .8948  -.0092          .0351   .7613   .0626
III      .0554  -.0092   .7801          .0390   .062?   .7751
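
The relation between Table 4 and the bottom row of Table 3 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it takes the diagonal entries of Table 4 as read from the printed page, computes their square roots, and shows the kind of rescaling the text describes. The function names are hypothetical, and the choice of the Vocabulary weight for the Over 75 group (.60) is simply one example.

```python
import math

# Diagonal entries (variances) of the factor-score matrices in Table 4,
# in the order Factor I, II, III. Values are as read from the table.
variances = {
    "60-64": (0.9989, 0.9987, 1.0412),
    "65-69": (1.0002, 1.0076, 1.0013),
    "70-74": (0.8876, 0.8948, 0.7801),
    "over 75": (0.9233, 0.7613, 0.7751),
}


def factor_score_sds(diagonal):
    """Standard deviations are the square roots of the diagonal variances."""
    return tuple(math.sqrt(v) for v in diagonal)


def rescale_weight(weight, sd):
    """Scale a regression weight up when the factor-score SD is below unity."""
    return weight / sd


sds = {group: factor_score_sds(diag) for group, diag in variances.items()}

for group, values in sds.items():
    print(group, [f"{sd:.2f}" for sd in values])

# Example: Vocabulary weight on Factor I for the Over 75 group (.60 in Table 3).
print(round(rescale_weight(0.60, sds["over 75"][0]), 2))
```

Most of the square roots agree with the SD row of Table 3 to two decimals. The value computed for Factor III in the 60-64 group comes out near 1.02 rather than the printed 1.04, which may reflect rounding or a slip somewhere in transcription.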


DISCUSSION AND CONCLUSIONS 


When looking for trends with age in performance 
on the WAIS battery, it should be remembered 
that the basic data are the means and standard 
deviations (preferably for raw scores, which are 
uncontaminated by scaling) given by Doppelt and 
Wallace (1955). The information supplied by the 
factor loadings, and more especially by the factor 
weights (Table 3), is complementary to the basic 
data and is important primarily for showing the 
roles which the tests play relative to each other— 
that is, how they compete with each other in the 
composition of the factors as people advance in age. 

The loadings for all groups on the first factor are 
relatively high and positive, the highest being for 
Vocabulary, Comprehension, and Similarities. But 
when we come to examine the weights, interesting 
trends become obvious. Neglecting Digit Span and 
the five Performance tests, for which the weights 
are negligible, it is seen that those for Vocabulary 
steadily increase with age, whereas those for
Similarities, Comprehension, and (to a lesser 
extent) Arithmetic steadily decrease. Vocabulary 
is a relatively pure measure of verbal comprehen- 
sion, but Comprehension also measures (Davis, 
1956) general reasoning and fluency (that is, 
responsiveness and uninhibitedness). Similarities 
measures, in addition, visualization (the ability to 
manipulate actual and symbolic objects and to see 
relationships), and Arithmetic measures reasoning 
and numerical knowledge. It is logical then to 
conclude that with the onset of old age, verbal 
comprehension plays an ever increasing role in any 
determination of general intelligence, while general 
reasoning, visualization, and fluency play a 
gradually decreasing role. 

The second factor, to which we have attached 
the broad general title space performance, has its 
highest loadings on Block Design, Object As- 
sembly, and Digit Symbol. When we examine the 
factor weights, it is seen that those for Object 
Assembly and, to a lesser degree, Digit Symbol and 
Picture Arrangement, all of which are positive, 
tend to decrease with age, just as the weights for 
Comprehension, which are negative and anti- 
thetical to the former, decrease correspondingly. 
Since the main contrast here is between reasoning, 
in the case of Comprehension, and perceptual 
speed, in the case of the Performance tests, the 
decay of these two types of ability presumably 
explains why the weights for the tests tend to 
decrease and converge to zero as old age sets in. 
Block Design, on the other hand, which contrasts 
with Object Assembly in as far as it is a measure of 
mechanical knowledge, weathers well with age. 
The weights for Arithmetic for the second factor, 
which tend to increase, support this interpretation. 
This suggests that the retention of certain in- 
formation—such as, for example, the multiplica- 
tion table—is little impaired in old age. 

The third factor, which we have tentatively 
called eduction of conceptual relations, has its
highest positive loadings for Similarities, Picture 
Arrangement, and Picture Completion, while it 
has considerable negative loadings for Digit Span 
and Arithmetic. Here the weights for Similarities, 
which are positive, and for Arithmetic and Block 
Design, which are negative, fall away steadily 
with age. This appears to be due to the decline in 
perceptual speed and numerical facility. The 
weights for Picture Completion, which are posi- 
tive, increase briskly until just over 70 and then 
suffer an abrupt decline. For Picture Arrangement, 
the weights are more steady. The latter deduc- 
tions, when taken in conjunction with our earlier 
remarks about this factor, suggest that the ability 
to form conceptual relations is fairly well main-
tained in old age, but that powers of inductive
reasoning play a decreasing role.

In summary, it seems fair to say that good per- 
formance on the WAIS battery of tests as old age 
sets in depends to an ever increasing extent on 
verbal comprehension, the command of language 
which a person attains and enjoys during youth 
and middle life. The contribution to performance 
made by inductive and deductive reasoning, per- 
ceptual speed, fluency, and perhaps to a lesser 
extent visualization, gradually declines. 
Problem solving research has consistently shown men to be superior
problem solvers to women (e.g., Maier, 1933). Berry (1958) and Sweeney
(1953) have shown this superiority even when certain correlated
aptitude variables were controlled for. A number of investigators
(Berry, 1958; Carey, 1958; Milton, 1957, 1958; Sweeney, 1953) have
suggested that women’s poor performance is due in part to a negative
attitude towards problem solving associated with their “sex role
identification.” Carey (1958) attempted to improve these attitudes by
having her subjects discuss them in groups, and accomplished an
increase in the women’s problem solving effectiveness.

The present study used Maier’s horse trading problem to determine
whether problem solving in groups might be effective in improving
women’s problem solving performance. Maier and Solem (1952) showed that
the proportion of correct answers to this problem increased following
group discussion. This research was designed to answer the questions:
Do women benefit as much as men from solving the problem in a group? Do
subjects benefit as much when they are in groups of only one sex as in
groups of both sexes? Is the effect the same when “majority right”
conditions are controlled for?


METHOD 


Subjects. Three hundred and eighty-eight students, primarily freshmen
and sophomores, were recruited from various courses at the university.
Half the subjects were males. Ninety-seven four-person groups were
established: 32 groups of each single sex composition (all male and all
female) and 33 groups of mixed sex composition (two males and two
females). Subjects were assigned at random to groups within each sex
composition category, except that friends and close acquaintances were
prevented from being in the same group.

Procedure. The standard horse trading problem² (Maier & Solem, 1952)
was typed on a small sheet with the possible answers listed (He made
$30, He made $20 [the correct answer], He made $10, He broke even, He
lost $10). This sheet was placed face down in front of each subject. At
a signal the subjects had one minute to solve the problem individually
by checking one of the answers supplied on the sheets. These answer
sheets were collected and a new set passed out for recording the
subject’s solution following discussion with the other members of his
(or her) group. Group discussion was permitted for 8 minutes. At the
end, subjects were told to “mark the answer you now believe to be
correct, regardless of what the rest of your group believes.” This
instruction was intended to discourage conformity.

¹ This investigation was supported by a USPHS research grant (M-2704)
from the National Institute of Mental Health, United States Public
Health Service.

² Half the subjects were asked “How much did he make on the two
transactions?” instead of the standard question, “How much did he make
in the horse business?” Distributions of responses to the two questions
were identical, so the results for both questions have been combined
for the present analysis.

RESULTS

As in past studies, a significantly higher proportion of males (53.1%)
than females (25.8%) solved the problem correctly individually.
Similarly, the proportion of males (84.8%) having the correct answer
after group discussion was significantly higher than the proportion of
females (60.1%). The proportion of males (75.6%) changing from an
incorrect answer to the correct answer after group discussion was
significantly higher than the proportion of females so changing
(51.4%). Finally, a slightly, but not significantly, higher proportion
of females (14.3%) than males (6.9%) switched from initially correct
responses to an incorrect answer following discussion. These results
are summarized in Table 1.

TABLE 1
PERCENTAGES OF CORRECT SOLUTIONS TO THE HORSE TRADING PROBLEM BY SEX

                                                       After discussion, switched to
                   Correct answer    Total correct     -----------------------------
Sex of subjects    before            answer after      Correct        Incorrect
                   discussion        discussion        answer         answer
Males              53.1%             84.8%             75.6%          6.9%
                   (194)a            (191)             (90)           (101)
                   **                **                **
Females            25.8%             60.1%             51.4%          14.3%
                   (194)             (193)             (144)          (49)

a Numbers in parentheses are bases for percentages. Four subjects
failed to record answers following the discussion: two males and one
female with the correct answer initially and one male with an incorrect
answer.
** Difference in percentages between males and females significant at
the .01 level.

Although these results show a clear superiority of males over females
in the ability to solve the horse trading problem, some surprising
differences emerged when the data were analyzed according to the sex
composition of the groups. Table 2 compares the proportions of males
and females in single sex and mixed sex groups achieving correct
solutions before and after discussion, as well as the proportions of
those who were initially incorrect who changed to correct solutions
after discussion.³ It can be seen that prior to the discussion, males
had a significantly higher proportion of correct answers than did
females in both single sex and mixed sex groups. It should also be
noted, however, that a somewhat higher proportion of females in mixed
sex groups than in all female groups were initially correct (t = 1.65,
p < .10).

³ The percentage of subjects switching from correct to incorrect
solutions is so small that we have not reported it in subsequent
tables. The trend in those data supports the main effects discussed, in
that females in all female groups showed the greatest tendency (18.5%)
to switch from correct to incorrect responses following discussion.

TABLE 2
PERCENTAGES OF SOLUTIONS BEFORE AND AFTER GROUP DISCUSSION BY SEX AND
SEX COMPOSITION

                   Correct answer    Total correct     After discussion
Type of group      before            answer after      switched to
                   discussion        discussion        correct
Single sex
  Males            52.3              85.6              78.3
                   (128)a            (125)             (60)
                   **                **                **
  Females          21.9              49.6              41.0
                   (128)             (127)             (100)
Mixed sex
  Males            54.5              83.3              70.0
                   (66)              (66)              (30)
                   *
  Females          33.3              80.3              75.0
                   (66)              (66)              (44)

a Numbers in parentheses are bases for percentages. Four subjects in
single sex groups failed to respond following group discussion: two
males and one female with the correct answer and one male with an
incorrect answer.
* Difference in percentage between males and females significant at the
.05 level.
** Difference in percentage between males and females significant at
the .01 level.

After group discussion, in single sex groups the proportions of males
achieving the correct answer (85.6%) and switching from incorrect to
correct solutions (78.3%) were significantly greater than the
comparable proportions for females (49.6% and 41.0%, respectively). In
mixed sex groups, on the other hand, the proportions of males and
females who correctly answered the problem after discussion (83.3% and
80.3%, respectively) did not differ significantly, nor did the
proportions of males and females (70.0% and 75.0%, respectively) who
switched from incorrect to correct solutions. Moreover, these
proportions were almost the same as the comparable proportions for
males in the all male groups. In other words, while males profited as
much from group discussion in mixed sex groups as in all male groups,
females profited much more in mixed sex groups than in all female
groups. In fact, the females in mixed sex groups benefited as much as
males in either type of group.

Is this effect a function of women’s greater chance of being exposed to
the correct answer in mixed sex groups than in all female groups, and
thus conformity to the majority answer? Or does it result from women’s
conformity to the right answer proposed by male subjects?

At first glance, the results provide a conditionally affirmative answer
to the first question. Due to the higher proportion of correct first
solutions obtained both from males in general and from women in the
mixed sex groups than from females in all female groups, a greater
proportion of women was exposed to the correct answer in mixed sex
groups than in all female groups. Over half (55.6%) of the women in
mixed sex groups as against only 12.0% of the women in all female
groups (p < .01) were in groups where two or more subjects initially
held the correct answer. Furthermore, in these mixed sex groups with
two or more subjects initially correct, approximately the same
proportions of males as females reported correct answers following
discussion (90.5% and 83.3%, respectively) and changed from incorrect
to correct answers (81.8% and 79.2%). These results are comparable to
those obtained from males in all male groups as shown in the top half
of Table 3.
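
The significance statements above rest on comparisons of two independent proportions. The paper does not spell out the test statistic, so the following is only a sketch of one common choice, the pooled two-proportion z test, applied to the individual (pre-discussion) results. The counts 103 and 50 are not printed in the paper; they are the whole numbers implied by 53.1% and 25.8% of 194, and the function names are hypothetical.

```python
import math


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled z statistic for the difference between two independent proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / standard_error


def two_sided_p(z):
    """Two-sided tail probability under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Males vs. females correct before discussion (counts implied by Table 1).
z = two_proportion_z(103, 194, 50, 194)
print(f"z = {z:.2f}, two-sided p = {two_sided_p(z):.2g}")
```

Under these assumptions the difference is well beyond the .01 level, in line with the table footnotes. Other reasonable variants (an unpooled standard error, or a continuity correction) would give somewhat different values, so this should not be read as a reproduction of the authors' exact figures.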

The remaining results in Table 3 show, however, that the superiority of
the women in mixed sex groups over all female groups cannot be
explained entirely in terms of conformity to the majority answer. The
percentage of women switching from an incorrect to the correct answer
was not significantly different in mixed sex groups where two or more
subjects initially had the right answer (79.2%) from what it was in
mixed sex groups where one or no subjects were initially correct
(70.0%). By contrast, in groups where one or no subjects had the
correct answer initially, the proportion of women switching to the
correct answer was significantly higher in mixed sex groups (70.0%)
than in all female groups (39.8%), even though only a small proportion
of both males and females was correct before the discussion in both
single sex and mixed sex groups. Thus regardless of the initial
distributions of correct answers in the groups, the percentages of
women in mixed sex groups changing to the correct answer were
approximately the same as the percentages of men so changing in both
all male and mixed sex groups. Females in all female groups, on the
other hand, were generally much less successful in solving the problem.

TABLE 3
PERCENTAGES OF SOLUTIONS BEFORE AND AFTER GROUP DISCUSSION BY SEX, SEX
COMPOSITION, AND INITIAL DISTRIBUTION OF SOLUTIONS

                                  Correct answer   Total correct   Switched
                                  before           answer after    to correct
                                  discussion       discussion      answer
Two or more members initially correct:
  Single sex groups
    Males                         67.0             88.2            78.6
                                  (88)a            (85)            (28)
    Females                       50.0             66.7            50.0
                                  (24)             (24)            (12)
  Mixed sex groups
    Males                         73.8             90.5            81.8
                                  (42)             (42)            (11)
    Females                       42.9             83.3            79.2
                                  (42)             (42)            (24)
One or no member initially correct:
  Single sex groups
    Males                         20.0             80.0            78.1
                                  (40)             (40)            (32)
                                                   **              **
    Females                       15.4             45.6            39.8
                                  (104)            (103)           (88)
  Mixed sex groups
    Males                         20.8             70.8            63.2
                                  (24)             (24)            (19)
    Females                       16.7             75.0            70.0
                                  (24)             (24)            (20)

a Numbers in parentheses are bases for percentages. Of the four
subjects who failed to respond the second time, referred to in Table 2,
Footnote a, the three males were in groups with two or more subjects
correct and the female was in a group with one member correct.
** Difference in percentage between males and females significant at
the .01 level.

The data are insufficient to give a definitive answer to whether the
women in mixed sex groups conformed to males’ having the correct
answer, but what evidence is available casts doubt on the adequacy of
this explanation also. There were nine mixed sex groups with one right
and three wrong answers to the first administration of the problem:
five groups in which a male was correct, and four in which a female was
correct. Women were no more likely to switch to the correct answer
following discussion where a male had the correct answer than where a
female had it. Also, males were about as likely to change to the
correct answer where a female was initially correct as where a male
was, a finding that provides further evidence that the women in mixed
sex groups engaged actively in problem solving rather than looking to
the men for the correct answer.


DISCUSSION 


The over-all results confirm earlier findings 
which showed the superior ability of males over 
females to solve the horse trading problem. The 
fact that women’s performance in mixed sex groups 
was equal to men’s and superior to women’s in all 
female groups confuses the picture, however. 

The random assignment of subjects to mixed 
and single sex groups, as well as the consistency of 
the results when one or no members were initially 
correct, rules out the likelihood that differences in 
personal attributes (e.g., aptitudes, attitudes) of 
women in the two types of groups account for 
their performance differences. Something in the 
mixed sex situation which differs from the all 
female situation must account for the difference. 

The possibility that women in mixed sex groups 
conformed to the correct answers developed by the 
male members cannot be ruled out. The probability 
that males more often than females discovered the 
correct answer in the mixed sex groups is strongly 
suggested by the significantly higher proportion of 
males as compared to females in single sex groups 
who had the correct answer after the group 
discussion.

The results may also be explained, however, by 
hypothesizing that women in mixed sex groups 
were more motivated to work on the problem than 
were the women in all female groups. The men
may have challenged the women to higher per- 
formance, have encouraged and guided the women 
to think through the solution process, or merely 
have provided a legitimation of the problem solv- 
ing activity. Such an explanation is consistent with 
a possible interpretation of Carey’s (1958) results. 
She attributed her women’s improved problem
solving performance to their improved attitude
toward problem solving following discussion. 
However, she did not find evidence of attitudinal 
improvement. It is possible, then, that the dis- 
cussions of attitudes served to encourage the 
women to work harder in attempting to solve the 
second round of problems, and, since many women 
had the aptitude necessary to solve the problems, 
their average success increased. 

This interpretation may also explain why Maier 
(1933) found that women profited more than men 
in improved problem solving performance from a 
lecture on problem solving. The lecture, in giving 
them an approach to successful problem solving, 
may also have encouraged them to use their 
abilities, while the women in the control groups, 
without such encouragement, approached the 
problems with their usual negative attitudes. 










Thus despite women’s generally negative atti- 
tudes toward problem solving, circumstances may 
be manipulated to motivate them momentarily to 
actively and successfully solve problems. While 
Milton (1957) and Carey (1958) have suggested 
that women’s sex role identification produces 
negative attitudes toward problem solving, the 
present results suggest that under certain condi- 
tions these negative attitudes may be overcome 
and women can adopt a positive approach to 
solving problems. Whether the negative attitudes 
are held in abeyance (to be reinvoked upon leaving 
the experimental situation) or other facets of the 
women’s personal identification are evoked in the 
presence of men remains an unanswered question. 
It is possible that women’s sex role identification 
may be used to motivate their problem solving 
activity. Milton’s (1958) more recent study showed 
that problems phrased in a “feminine content 
form” were solved approximately equally well by 
males and females and his interpretation of the 
results is consistent with the present one. In just 
what other ways sex role identification may be 
turned to productive problem solving activity in 
women would seem to be a subject for further 
research. 


SUMMARY 


One hundred ninety-four men and an equal 
number of women solved Maier’s horse trading 
problem, first as individuals and then in groups of 
four. There were 32 all male, 32 all female, and 
33 mixed sex groups (two males and two females). 

The following results were consistent with earlier findings:

1. 53.1% of the males as against 25.8% of the females solved the horse
trading problem correctly individually (p < .01).

2. 84.8% of the males as against 60.1% of the females had the correct
answer following group discussion (p < .01).

3. 75.6% of the males as against 51.4% of the females switched from an
incorrect to the correct answer following group discussion (p < .01).

Confusing the sex difference literature is the finding that the
percentage of women in mixed sex groups switching from an incorrect to
a correct answer following group discussion was: (a) approximately
equal to the percentage of men so switching and (b) significantly
greater than the percentage of women so switching in all female groups.

Although an explanation in terms of women’s conformance to correct
answers found by the male members of the group cannot be ruled out
completely, an alternative explanation appears stronger and consistent
with the results of previous work by Maier (1933), Carey (1958), and
Milton (1958). The suggested hypothesis is that the women in mixed sex
groups were more motivated to solve the problem than were the women in
all female groups.
INTROVERSION-EXTRAVERSION DIFFERENCES IN
JUDGMENTS OF TIME


R. LYNN
University of Exeter, England


An attempt has been made by Claridge (1960) to apply Eysenck’s (1957)
theory of introversion-extraversion to the problem of time errors. The
time error is a constant error that occurs in psychophysical judgments,
and consists of systematic overestimation (negative time errors) or
underestimation (positive time errors) of the second of two identical
stimuli. The hypothesis advanced by Claridge is as follows: the first
stimulus produces both excitatory and inhibitory effects, the
inhibitory effects reducing its perceived intensity or duration. This
hypothesis accounts for the preponderance of negative time errors that
most investigators have reported. It also follows from this hypothesis,
taken in conjunction with Eysenck’s postulate that extraverts generate
reactive inhibition more quickly and dissipate it more slowly than
introverts, that extraverts should show larger negative time errors
than introverts. Claridge verified this hypothesis using judgments of
intensity of sound and duration of time. A tendency for hysterics
(extraverted neurotics) to show greater negative time errors than
dysthymics (introverted neurotics) in judgments of time intervals has
also been reported by Eysenck (1959b).

Recently Llewellyn-Thomas (1959) has published a new procedure for
obtaining time judgments designed to maximize individual differences.
This procedure involves the use of a positive feedback technique in
which the subject is required to make a judgment of a standard and is
then presented with his judgment as his new standard and so on over a
number of trials. Any tendency to error becomes cumulative with this
method. The present paper reports the results of an investigation
designed to examine introversion-extraversion differences in
estimations of time using the Llewellyn-Thomas technique.


METHOD


The subjects were two groups of 20 introverted and 20 extraverted male
university students. Introversion-extraversion was assessed by the
Maudsley Personality Inventory (Eysenck, 1959a). The introverts scored
between 6 and 18 on this scale and the extraverts between 29 and 44.

The procedure follows that of Llewellyn-Thomas (1959). The apparatus
consisted of a light that could be switched on with two keys, one of
which was held by the experimenter and the other by the subject. The
experimenter told the subject that he would switch on the light for a
brief interval and following this the subject should switch on the
light and attempt to keep it on for the same interval of time. The
subject was told that there would be a number of trials which might
differ in length and he was asked not to count or use any other aids to
time estimation. The first standard was 15 seconds, and there were nine
further trials in which the subject was successively given his last
judgment as his new standard. The interval between trials was
approximately 5 seconds.


RESULTS

The results are given in Table 1. It will be noted that there are no
differences between introverts and extraverts on the first five trials,
but that differences in the direction predicted by Eysenck’s theory
emerge from Trial 6 and become statistically significant at the .05
level on the last three trials. The results do not replicate the
findings of Claridge and Eysenck exactly, since they found significant
introversion-extraversion differences on a single trial. It is not easy
to account for this discrepancy, since the procedures used appear to
have been the same. Nevertheless, the hypothesis that
introversion-extraversion differences in time judgments do exist
receives some support from the present experiment. Further, Eysenck’s
theory that these introversion-extraversion differences reflect
differences in the generation of reactive inhibition entails the
prediction that the differences would become greater as the trials
proceed, since reactive inhibition would not dissipate fully in the
intertrial interval; and it is evident from inspection of the table
that this also occurred.

TABLE 1
MEAN TIME ESTIMATIONS OF INTROVERT AND EXTRAVERT GROUPS USING THE
LLEWELLYN-THOMAS POSITIVE FEEDBACK TECHNIQUE

                                        Trials
                  1      2      3      4      5      6      7      8      9     10
Introverts
  M            15.2   14.5   14.2   12.7   12.3   13.9   15.5   17.7   17.5   18.4
  SD            4.2    2.6    6.3    4.1    6.5    6.3    7.6    7.0    9.4   11.4
Extraverts
  M            15.4   15.0   13.5   13.0   12.5   11.8   11.6   11.3   10.4   10.2
  SD            5.3    4.5    5.1      ?      ?    8.7    8.7    8.9    8.7    9.6
Value of t       ns     ns     ns     ns     ns     ns     ns  2.46*  2.21*  2.40*

* Significant at the .05 level.

A recent series of papers (Klein & Krech, 1952;
Krech & Calvin, 1953; Livson & Krech, 
1955, 1956) has developed and elaborated the 
construct of “cortical conductivity” (CC). In 
brief, CC is said to be a parameter of cortical inte- 
gration and, as such, is postulated to be related to 
individual differences in cognitive functioning. 
The kinesthetic aftereffect test (KAE), associated
with the work of Köhler and Wallach (1944), has
been employed as an indirect measure of CC.
High CC is viewed as being positively related to
efficient cognitive behavior and inversely related to
the magnitude of the KAE (Livson & Krech, 
1955). 

The present study provides a direct test of the
Livson and Krech (1955) hypothesis that vocabulary
scores are negatively related to KAE. In
addition, since scores on the visual aftereffect test
(VAE) and the KAE are not significantly
intercorrelated (Spitz & Lipman, 1960), the relationship
between the VAE and the Vocabulary test will 
also be examined. 


METHOD 


The VAE and KAE apparatus and procedure have
been detailed elsewhere (Spitz & Lipman, 1960). The
reliabilities of these tasks are .73 and .74, respectively.
The Vocabulary test (Davis & Davis, 1953) was group
administered and consisted of 60 items in which the
subject had to select the correct answer from among
five alternatives. Although the test was governed by a
15-minute time limit, all subjects completed the task.


Subjects and Procedure 


In the first phase of the study, 111 female and 39 
male college sophomore volunteers were administered 
the VAE followed by the KAE. The Vocabulary test
was administered early the following semester during 
a regular auditorium meeting of the sophomore class. 
Of the subjects on whom KAE and VAE measures 
were obtained, 97 females and 27 males also received 
the Vocabulary test. 


RESULTS AND DISCUSSION 


Since Rechtschaffen and Bookbinder (1960)
have reported significantly larger aftereffects for
males than females, these data were examined
separately by sex. The normative data on the 
Vocabulary, KAE, and VAE tests are presented in 
Table 1. Pearson r’s between the Vocabulary and 
aftereffects measures are presented in Table 2. 
These results clearly do not support the Livson 
and Krech hypothesis. It is possible, of course, that 
a more direct measure of CC, i.e., some physio- 
logical measure (Krech, Rosenzweig, Bennett, & 
Krueckel, 1954) would yield different findings. At 
present, however, the Livson and Krech (1956) 
proposal that Spearman’s “g” is “linked” with the
construct of CC seems highly tenuous. 


TABLE 1


MEAN AND SD FOR THE VOCABULARY,
VAE, AND KAE TESTS


                    Vocabulary        KAE (in mm.)      VAE (in mm.)
            N       M       SD        M       SD        M       SD
Females     97    37.40    6.70     2.00     3.05     4.55     3.37
Males       27    35.37    8.19     3.25     2.73     5.85     3.71


TABLE 2


PEARSON CORRELATIONS BETWEEN VOCABULARY
AND THE KAE AND VAE TESTS


               Females (N = 97)      Males (N = 27)
                KAE       VAE         KAE       VAE
Vocabulary     +.16      +.08        -.12      +.06


Note.—None of the correlations are significant at
the .05 level of confidence.
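
The note to Table 2 can be checked with the usual conversion of a Pearson r to a t statistic, t = r * sqrt(N - 2) / sqrt(1 - r^2), with N - 2 degrees of freedom. The sketch below is an illustration only. It assumes a two-tailed test, which the text does not specify, and the critical values are approximate figures from a standard t table rather than values reported in the study. The function name t_from_r is introduced here for convenience.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """Convert a Pearson correlation r from n subjects to a t statistic (df = n - 2)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Approximate two-tailed .05 critical values of t (assumed, from a standard table).
CRITICAL_T = {95: 1.985, 25: 2.060}

# Correlations and group sizes as given in Table 2.
correlations = [
    ("Females, KAE", 0.16, 97),
    ("Females, VAE", 0.08, 97),
    ("Males, KAE", -0.12, 27),
    ("Males, VAE", 0.06, 27),
]

results = []
for label, r, n in correlations:
    t = t_from_r(r, n)
    significant = abs(t) > CRITICAL_T[n - 2]
    results.append((label, t, significant))
    print(f"{label}: r = {r:+.2f}, t({n - 2}) = {t:+.2f}, significant = {significant}")
```

Under these assumptions every |t| falls below its critical value, which agrees with the note that none of the correlations reaches the .05 level.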